DESCRIPTION

Development of Novel Anti-Microbial Agents Based on Bacteriophage Genomics

BACKGROUND OF THE INVENTION 

The present invention relates to the field of antibacterial agents and the 
treatment of infections of animals or other complex organisms by bacteria. 

The frequency and spectrum of antibiotic-resistant infections have, in recent 
years, increased in both the hospital and community. Certain infections have become 
essentially untreatable and are growing to epidemic proportions in the developing
world as well as in institutional settings in the developed world. The staggering 
spread of antibiotic resistance in pathogenic bacteria has been attributed to microbial 
genetic characteristics, widespread use of antibiotic drugs, and changes in society that 
enhance the transmission of drug-resistant organisms. This spread of drug-resistant
microbes is leading to ever-increasing morbidity, mortality and health-care costs.

Ironically, it is the very success of antibiotics, resulting in their widespread 
use, that has contributed the most to rising numbers of drug resistant bacterial strains. 
The longer a bacterial strain is exposed to a drug, the more likely it is to acquire 
resistance. Today, a total of 160 antibiotics, all based on a few basic chemical
structures and targeting a small number of metabolic pathways, have found their way 
to market. Over-prescription of these drugs, as well as the failure of patients to 
comply with the complete antibiotic regimen, has led to the rapid emergence of
antibiotic-resistant strains. Such misuse of prescriptions, careless use of antibiotics in
virtually all commercial production of beef and fowl, and changing societal
conditions, such as the growth of day-care centers, increased long-term care in
hospitals, and increased mobility of the population, have provided an environment
where drug-resistant microbes can emerge and spread. Thus, virtually all common
infectious bacteria are becoming, or have already become, resistant to one or more 
groups of antibiotics. Such resistance now reaches all classes of antibiotics currently 
in use, including: β-lactams, fluoroquinolones, aminoglycosides, macrolide peptides,
chloramphenicol, tetracyclines, rifampicin, folate inhibitors, glycopeptides, and
mupirocin.

Over the last 45 years bacteria have adapted genetically to avoid the 
destruction/alteration of the essential pathways that these chemotherapeutic agents 
target. Antibiotic-resistant bacterial strains are now emerging at a higher rate than the
rate at which new antibiotics are being developed. The consequence of this dilemma
has been a dramatic increase in the cost of treating infections that would otherwise
easily succumb to routine antibiotic therapy. Furthermore, and perhaps most 
importantly, the emergence of multiple drug resistant pathogenic bacteria has led to a 
significant increase in morbidity and mortality, particularly in institutional settings. 

Most major pharmaceutical companies have on-going drug discovery 
programs for novel anti-microbials. These are based on screens for small molecule 
inhibitors (natural products, bacterial culture media, libraries of small molecules, 
combinatorial chemistry) of crucial metabolic pathways of the micro-organism of 
interest (e.g., bacteria, fungi, parasites, worms). The screening process is largely for 
cytotoxic compounds and in most cases is not based on a known mechanism of action 
of the compounds. Pharmaceutical companies have large programs in this area. 
Classical drug screening programs are being exhausted and many of these 
pharmaceutical companies are looking towards rational drug design programs.

Several small to mid-size biotechnology companies as well as large 
pharmaceutical companies have developed systematic high-throughput sequencing 
programs to decipher the genetic code of specific micro-organisms of interest. The 
goal is to identify, through sequencing, biochemical pathways or intermediates
that are unique to the microorganism. Knowledge of this may, in turn, form the
rationale for a drug discovery program based on the mechanism of action of the 
identified enzymes/proteins. Genome Therapeutics Corp., The Institute for Genomic
Research, Human Genome Sciences Inc., and other companies have such sequencing 
programs in place. However, one of the most critical steps in this approach is the 
ascertainment that the identified proteins and biochemical pathways are 1) non- 
redundant and essential for bacterial survival, and 2) constitute suitable and accessible 
targets for drug discovery.

SUMMARY OF THE INVENTION

While animals such as humans are, on occasion, infected by pathogenic 
bacteria, bacteria also have natural enemies. A number of host-specific viruses, 
known as bacteriophages or phages, infect and kill bacteria in the natural 
environment. Such bacteriophages generally have small compact genomes and
bacteria are their exclusive hosts. Many known bacteria are host to a large number of 
bacteriophages that have been described in the literature. During the 1940's - 1960's, 
phage biology was an area of active research. As a testimony to this, the study of 
phages which infect and inhibit the enteric bacterium Escherichia coli (E. coli) 
contributed much to the early understanding of molecular biology and virology. 

As is generally understood, bacteriophages (or phages) are viruses that infect
and kill bacteria. They are natural enemies of bacteria and, over the course of
evolution, have developed proteins (products of DNA sequences) which enable them
to infect a host bacterium, replicate their genetic material, usurp host metabolism, and
ultimately kill their host. The scientific literature well documents the fact that many 
known bacteria have a large number of such bacteriophages (Ackermann and DuBow, 
1987) that can infect and kill them (for example, see the ATCC bacteriophage 
collection at http://www.atcc.org). 

This invention utilizes the observation that bacteriophages successfully infect 
and inhibit or kill host bacteria, targeting a variety of normal host metabolic and 
physiological traits, some of which are shared by all bacteria, pathogenic and 
nonpathogenic alike. The term "pathogenic" as used herein denotes a contribution to 
or implication in disease or a morbid state of an infected organism. The invention 
thus involves identifying and elucidating the molecular mechanisms by which phages 
interfere with host bacterial metabolism, an objective being to provide novel targets 
for drug design. Whether the phage blocks bacterial RNA transcription or translation, 
or attacks other important metabolic pathways, such as cell wall assembly or 
membrane integrity, the basic blueprint for a phage's bacteria-inhibiting ability is 
encoded in its genome and can be unlocked using bioinformatics, functional 
genomics, and proteomics. By these means, the invention utilizes sequence 
information from the genomics of bacteriophage to identify novel antimicrobials that 
can be further used to actively and/or prophylactically treat bacterial infection. 

Two important components of the invention thus are: i) the identification of 
bacteria-inhibiting phage open reading frames ("ORFs") and corresponding products
that can be used to develop antibiotics based on amino acid sequence and secondary 
structural characteristics of the ORF products, and ii) the use of bacteriophages to map 
out essential bacterial target genes and homologs, which can in turn lead to the
development of suitable anti-microbial agents. These two avenues represent new and 
general methods for developing novel antimicrobials. 

The invention thus concerns the identification of bacteriophage ORFs that 
supply bacteria-inhibiting functions. In this regard, use of the terms "inhibit",
"inhibition", "inhibitory", and "inhibitor" all refer to a function of reducing a 
biological activity or function. Such reduction in activity or function can, for 
example, be in connection with a cellular component, e.g., an enzyme, or in 
connection with a cellular process, e.g., synthesis of a particular protein, or in 
connection with an overall process of a cell, e.g., cell growth. In reference to bacterial 
cell growth, for example, an inhibitory effect (i.e., a bacteria-inhibiting effect) may be 
bacteriocidal (killing of bacterial cells) or bacteriostatic (i.e., stopping or at least 
slowing bacterial cell growth). The latter slows or prevents cell growth such that 
fewer cells of the strain are produced relative to uninhibited cells over a given period 
of time. From a molecular standpoint, such inhibition may equate with a reduction in 
the level of, or elimination of, the transcription and/or translation of a specific 
bacterial target(s), or reduction or elimination of activity of a particular target
biomolecule.

It is particularly advantageous to evaluate a plurality of different phage ORFs 
for inhibitory activity that may be from one, but is preferably from a plurality of 
different phage. For example, evaluating ORFs from a number of different phage of 
the same bacterial host provides at least two advantages. One is that the multiple 
phages will provide identification of a variety of different targets. Second, it is likely 
that multiple phage will utilize the same cellular target.

As used herein, the terms "bacteriophage" and "phage" are used 
interchangeably to refer to a virus which can infect a bacterial strain or a number of 
different bacterial strains. 

In the context of this invention, the term "bacteriophage ORF" or "phage
ORF" or similar term refers to a nucleotide sequence in or from a bacteriophage. In
connection with a particular ORF, the terms refer to an open reading frame which has at
least 95% sequence identity, preferably at least 97% sequence identity, more
preferably at least 98% sequence identity with an ORF from the particular phage
identified herein (e.g., with an ORF as identified herein) or to a nucleic acid sequence
which has the specified sequence identity percentage with such an ORF sequence.

WO 00/32825 PCT/IB99/02040

A first aspect of the invention thus provides a method for identifying a
bacteriophage nucleic acid coding region encoding a product active on an essential
bacterial target by identifying a nucleic acid sequence encoding a gene product which
provides a bacteria-inhibiting function when the bacteriophage infects a host
bacterium, preferably one that is an animal or plant pathogen, more preferably a bird 
or mammalian pathogen, and most preferably a human pathogen. The bacteriophage 
is an uncharacterized bacteriophage. Thus, the method excludes, for example, phage
λ, φX174, M13 and other E. coli-specific bacteriophage that have been studied with
respect to gene number and/or function. It also excludes, for example, the nucleic 
acid coding regions described in Tables 12-14, and in preferred embodiments, 
excludes the phage in which those regions are naturally located. 

In connection with bacteriophage, the term "uncharacterized" means that a
certain bacteriophage's genome has not yet been fully identified such that the genes 
having function involved in inhibiting host cells have not been identified. In 
particular, phage for which the description of genomic or protein sequence was first 
provided herein are uncharacterized. Phage sequences for which host bacteria- 
inhibiting functions have been identified prior to the filing of the present application 
(or alternatively prior to the present invention) are specifically excluded from the 
aspects involving utilization of sequences from uncharacterized bacteriophage, except 
that aspects may involve a plurality of phage where one or more of those phage are 
uncharacterized and one or more others have been characterized to some extent. A
number of different bacteria-inhibiting phage ORFs are indicated in Tables 11-14.
The phage ORFs or sequences identified therein are not within the term
"uncharacterized"; alternatively, in preferred embodiments the phage containing those
ORFs are excluded from this term. Further, any additional phage ORFs (or 
alternatively the phage which contain those ORFs) which have previously been 
described in the art as bacteria-inhibiting ORFs are expressly excluded; those ORFs or 
phage are known to those skilled in the art and the exclusion can be made express by 
specifically naming such ORFs or phage as needed (likewise for uncharacterized 
targets as described below). For the sake of brevity, such a listing is not expressly 
presented, as such information is readily available to those skilled in the art. 

Stating that an agent or compound is "active on" a particular cellular target, 
such as the product of a particular gene, means that the target is an important part of a 
cellular pathway which includes that target and that the agent acts on that pathway. 
Thus, in some cases the agent may act on a component upstream or downstream of the 
stated target, including on a regulator of that pathway or a component of that pathway. 
By "essential", in connection with a gene or gene product, is meant that the host
cannot survive, or is significantly growth-compromised, in the absence,
depletion, or alteration of functional product. An "essential gene" is thus one that
encodes a product that is beneficial, or preferably necessary, for cellular growth in
vitro in a medium appropriate for growth of a strain having a wild-type allele
corresponding to the particular gene in question. Therefore, if an essential gene is 
inactivated or inhibited, that cell will grow significantly more slowly, preferably at less
than 20%, more preferably less than 10%, most preferably less than 5% of the growth
rate of the uninhibited wild-type, or not at all, in the growth medium. Preferably, in 
the absence of activity provided by a product of the gene, the cell will not grow at all 
or will be non-viable, at least under culture conditions similar to the in vivo conditions 
normally encountered by the bacterial cell during an infection. For example, absence 
of the biological activity of certain enzymes involved in bacterial cell wall synthesis 
can result in the lysis of cells under normal osmotic conditions, even though 
protoplasts can be maintained under controlled osmotic conditions. In the context of 
the invention, essential genes are generally the preferred targets of antimicrobial 
agents. Essential genes can encode target molecules directly or can encode a product 
involved in the production, modification, or maintenance of a target molecule. 
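
The growth-rate tiers in the preceding paragraph can be illustrated with a short
calculation. The following is only a sketch under stated assumptions: growth is taken
to be exponential between two optical density readings, and the function names
(growth_rate, essentiality_tier) and tier labels are hypothetical, introduced here for
illustration rather than taken from the specification.

```python
import math


def growth_rate(od_start: float, od_end: float, hours: float) -> float:
    """Specific growth rate (per hour), assuming exponential growth
    between two optical density readings (an assumption of this sketch)."""
    return math.log(od_end / od_start) / hours


def essentiality_tier(inhibited_rate: float, wild_type_rate: float) -> str:
    """Place the inhibited strain's growth rate, relative to the uninhibited
    wild-type rate, into the <20%, <10%, <5% tiers named in the text."""
    if wild_type_rate <= 0:
        raise ValueError("wild-type growth rate must be positive")
    # A shrinking culture is treated the same as no growth.
    relative = max(inhibited_rate, 0.0) / wild_type_rate
    if relative == 0.0:
        return "no growth"
    if relative < 0.05:
        return "most preferred (<5%)"
    if relative < 0.10:
        return "more preferred (<10%)"
    if relative < 0.20:
        return "preferred (<20%)"
    return "not below 20%"


if __name__ == "__main__":
    wild_type = growth_rate(0.05, 0.80, 4.0)   # uninhibited culture
    inhibited = growth_rate(0.05, 0.06, 4.0)   # culture with the gene inhibited
    print(essentiality_tier(inhibited, wild_type))
```

The thresholds are exclusive, following the "less than" wording of the definition, so
a strain growing at exactly 10% of the wild-type rate falls in the 20% tier.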

A "target" refers to a biomolecule that can be acted on by an exogenous agent, 
thereby modulating, preferably inhibiting, growth or viability of a cell. In most cases 
such a target will be a nucleic acid sequence or molecule, or a polypeptide or protein. 
However, other types of biomolecules can also be targets, e.g., membrane lipids and 
cell wall structural components. 

The term "bacterium" refers to a single bacterial strain, and includes a single 
cell, and a plurality or population of cells of that strain unless clearly indicated to the 
contrary. In reference to bacteria or bacteriophage, the term "strain" refers to bacteria 
or phage having a particular genetic content. The genetic content includes genomic 
content as well as recombinant vectors. Thus, for example, two otherwise identical 
bacterial cells would represent different strains if each contained a vector, e.g., a 
plasmid, with different phage ORF inserts. 

In preferred embodiments, the phage is Staphylococcus aureus phage 77, 3A,
96, or 44AHJD, Enterococcus sp. phage 182, or Streptococcus pneumoniae phage
Dp-1.

In preferred embodiments, the phage is selected from. Preferred embodiments 
involve expressing at least one recombinant phage ORF(s) in a bacterial host followed 
by inhibition analysis of that host. Inhibition following expression of the phage ORF
is indicative that the product of the ORF is active on an essential bacterial target.
Such evaluation can be carried out in a variety of different formats, such as on a
support matrix such as a solidified medium in a petri dish, or in liquid culture.
Preferably a plurality of phage ORFs are expressed in at least one bacterium. The
plurality of phage ORFs can be from one or a plurality of phage. With respect to a 
single phage or at least one phage in a plurality of phages, the plurality of expressed 
ORFs preferably represents at least 10%, more preferably at least 20%, 40%, or 60%, 
still more preferably at least 80% or 90%, and most preferably at least 95% of the 
ORFs in the phage genome. Preferably, for a plurality of phage, the plurality of 
expressed ORFs preferably represents at least 10%, more preferably at least 20%, 
40%, or 60%, still more preferably at least 80% or 90%, and most preferably at least 
95% of the ORFs in the phage genome of each phage. The plurality of phage ORFs 
can be expressed in a single bacterium, or in a plurality of bacteria where one ORF is 
expressed in each bacterium, or in a plurality of bacteria where a plurality of ORFs are 
expressed in at least one or in all of the plurality of bacteria, or combinations of these. 

In embodiments of the above aspect (as well as in other aspects herein) in 
which a plurality of phage are utilized, a plurality of phage have the same bacterial 
host species; have different bacterial host species; or both. The plurality of phage 
includes at least two different phage, preferably at least 3, 4, 5, 6, 8, 10, 15, 20, or more
different phage. Indeed, more preferably, the plurality of phage will include 50, 75, 
100, or more phage. As described herein, the larger number of phage is useful to 
provide additional target and target evaluation information useful in developing 
antibacterial agents, for example, by providing identification of a larger range of 
bacterial targets, and/or providing further indication of the suitability of a particular 
target (for example, utilization of a target by a number of different unrelated phage 
can suggest that the target is particularly stable and accessible and effective) and/or 
can indicate alternate sites on a target which interact with different inhibitors. 

Further embodiments involve confirmation of the inhibitor function of the 
phage ORF, such as by utilizing or incorporating a control(s) designed to confirm the 
inhibitory nature of the ORF(s) being evaluated. The control can, for example, be 
provided by expression of an inactive or partially inactive form of the ORF or ORF 
product, and/or by the absence of expression of the ORF or ORF product in the same 
or a closely comparable bacterial strain as that used for expression of the test ORF. 
The reduced level of activity or the absence of active ORF product in the control will 
thus not provide the inhibition provided by a corresponding inhibitory ORF, or will 
provide a distinguishably lower level of inhibition. An inactivated or partially 
inactivated control has a mutation(s), e.g., in the coding region or in flanking 
regulatory elements, that reduce(s) or eliminate(s) the normal function of the ORF.
Thus, the inhibition of a bacterium following expression of a phage ORF is 
determined by comparison with the effects of expression of an inactivated ORF or the 
response of the bacteria in the absence of expression in the same or similar type
bacterium. Such determination of inhibition of the bacterium following expression of 
the ORF is indicative of a bacteria-inhibiting function. These manipulations are 
routinely understood and accomplished by those of skill in the art using standard 
techniques. In embodiments utilizing absence of expression of the ORF, the bacteria 
can, for example, contain an empty vector or a vector which allows expression of an 
unrelated sequence which is preferably non-inhibitory. Alternatively, the bacteria 
may have no vector at all. Combinations of such controls or other controls may also 
be utilized as recognized by those skilled in the art. 

In embodiments involving expression of a phage ORF in a bacterial strain, in 
preferred embodiments that expression is inducible. 

By "inducible" is meant that expression is absent or occurs at a low level until 
the occurrence of an appropriate environmental stimulus provides otherwise. For the 
present invention such induction is preferably controlled by an artificial 
environmental change, such as by contacting a bacterial strain population with an 
inducing compound (i.e., an inducer). However, induction could also occur, for 
example, in response to build-up of a compound produced by the bacteria in the 
bacterial culture, e.g., in the medium. As uncontrolled or constitutive expression of 
inhibitory ORFs can severely compromise bacteria to the point of eradication, such 
expression is therefore undesirable in many cases because it would prevent effective 
evaluation of the strain and inhibitor being studied. For example, such uncontrolled 
expression could prevent any growth of the strain following insertion of a 
recombinant ORF, thus preventing determination of effective transfection or 
transformation. A controlled or inducible expression is therefore advantageous and is 
generally provided through the provision of suitable regulatory elements, e.g., 
promoter/operator sequences that can be conveniently transcriptionally linked to a 
coding sequence to be evaluated. In most cases, the vector will also contain 
sequences suitable for efficient replication of the vector in the same or different host 
cells and/or sequences allowing selection of cells containing the vector, i.e., 
"selectable markers." Further, preferred vectors include convenient primer sequences 
flanking the cloning region from which PCR and/or sequencing may be performed. 

As knowledge of the nucleotide sequence of phage ORFs is useful, e.g., for 
assisting in the identification of phage proteins active against essential bacterial host 
targets, preferred embodiments involve the sequencing of at least a portion of the 
phage genome in combination with the above methods. This can be done either before
or after or independent of expression and inhibition of the ORF in the bacteria, and 
provides information on the nature and characteristics of the ORF. Such a portion is 
preferably at least 10%, 20%, 40%, 80%, 90%, or 100% of the phage genome. For
embodiments in which a plurality of phage are utilized, preferably each phage is 
sequenced to an extent as just specified. 

Such sequencing is preferably accompanied by computer sequence analysis to 
define and evaluate ORF(s), ORF products, structural motifs or functional properties 
of ORF products, and/or their genetic control elements. Thus, certain embodiments 
incorporate computer sequence analyses of nucleic acid and/or amino acid sequences.
Further, existing data banks can provide phage sequence and product information 
which can be utilized for analysis and identification of ORFs in the sequence. 
Computer analysis may further employ known homologous sequences from other 
species that suggest or indicate conserved underlying biochemical function(s) for the 
inhibitory or potentially inhibitory ORF sequence(s) being evaluated. This can
include the sequences of signature motifs of identified classes of inhibitors. 
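
As an illustration of the kind of computer analysis that can define candidate ORFs in
a sequenced phage genome, the sketch below scans all six reading frames for stretches
that begin with ATG and end at the first in-frame stop codon. It is a simplified
example, not the analysis used in the specification: the function names and the
min_codons cutoff are hypothetical, alternative start codons such as GTG and TTG
(which bacteria and phage also use) are ignored, and only the characters A, C, G, and
T are handled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Yield (strand, start, end, orf) for ATG...stop ORFs in all six frames.

    start and end are 0-based offsets on the strand that was scanned; end is
    exclusive and includes the stop codon. min_codons counts the codons
    before the stop codon and is an arbitrary illustrative cutoff.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, start, i + 3, s[start:i + 3]
                    start = None                   # look for the next ORF in this frame


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    for orf in find_orfs(toy, min_codons=5):
        print(orf)
```

A production pipeline would normally rely on an established gene-prediction tool; the
point here is only to show how reading frames, start codons, and stop codons combine
to delimit an ORF.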

In the context of the phage nucleic acid sequences, e.g., gene sequences, of this 
invention, the terms "homolog" and "homologous" denote nucleotide sequences from 
different bacteria or phage strains or species or from other types of organisms that 
have significantly related nucleotide sequences, and consequently significantly related 
encoded gene products, preferably having related function. Homologous gene 
sequences or coding sequences have at least 70% sequence identity (as defined by the 
maximal base match in a computer-generated alignment of two or more nucleic acid 
sequences) over at least one sequence window of 48 nucleotides, more preferably at 
least 80 or 85%, still more preferably at least 90%, and most preferably at least 95%. 
The polypeptide products of homologous genes have at least 35% amino acid 
sequence identity over at least one sequence window of 18 amino acid residues, more 
preferably at least 40%, still more preferably at least 50% or 60%, and most 
preferably at least 70%, 80%, or 90%. Preferably, the homologous gene product is 
also a functional homolog, meaning that the homolog will functionally complement 
one or more biological activities of the product being compared. For nucleotide or 
amino acid sequence comparisons where a homology is defined by a % sequence 
identity, the percentage is determined using BLAST programs with default
parameters (Altschul et al., 1997, "Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs," Nucleic Acids Res. 25:3389-3402).
Any of a variety of algorithms known in the art which provide comparable results can
also be used, preferably using default parameters. Performance characteristics for
three different algorithms in homology searching are described in Salamov et al., 1999,
"Combining sensitive database searches with multiple intermediates to detect distant
homologues," Protein Eng. 12:95-100. Another exemplary program package is the
GCG™ package from the University of Wisconsin. 
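
The window-based identity thresholds in the homolog definition above (at least 70%
over a 48-nucleotide window, or at least 35% over an 18-residue window) can be
illustrated as follows. This sketch is not a substitute for BLAST: it assumes the two
sequences have already been aligned by some other program, with gaps written as "-",
and the function names are hypothetical.

```python
def best_window_identity(aligned_a: str, aligned_b: str, window: int) -> float:
    """Highest fraction of identical positions found in any run of `window`
    consecutive alignment columns. Gap columns never count as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if len(aligned_a) < window:
        raise ValueError("alignment is shorter than the window")
    matches = [
        int(x == y and x != "-")
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
    ]
    current = sum(matches[:window])
    best = current
    # Slide the window one column at a time, updating the match count.
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return best / window


def meets_homolog_threshold(aligned_a: str, aligned_b: str, kind: str) -> bool:
    """Apply the minimum thresholds from the text: 70% over 48 nucleotides,
    or 35% over 18 amino acid residues."""
    if kind == "nucleotide":
        window, threshold = 48, 0.70
    elif kind == "protein":
        window, threshold = 18, 0.35
    else:
        raise ValueError("kind must be 'nucleotide' or 'protein'")
    return best_window_identity(aligned_a, aligned_b, window) >= threshold


if __name__ == "__main__":
    a = "A" * 48
    b = "A" * 34 + "C" * 14
    print(meets_homolog_threshold(a, b, "nucleotide"))
```

Only the minimum thresholds are encoded; the more preferred levels (80%, 85%, 90%,
95% for nucleotides, and so on) could be passed as a parameter in the same way.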

Homologs may also or in addition be characterized by the ability of two 
complementary nucleic acid strands to hybridize to each other under appropriately 
stringent conditions. Hybridizations are typically and preferably conducted with 
probe-length nucleic acid molecules, preferably 20-100 nucleotides in length. Those 
skilled in the art understand how to estimate and adjust the stringency of hybridization 
conditions such that sequences having at least a desired level of complementarity will 
stably hybridize, while those having lower complementarity will not. For examples of 
hybridization conditions and parameters, see, e.g., Maniatis, T. et al. (1989)
Molecular Cloning: A Laboratory Manual. Cold Spring Harbor University Press, Cold
Spring, N.Y.; Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology.
John Wiley & Sons, Secaucus, N.J. Homologs and homologous gene sequences may 
thus be identified using any nucleic acid sequence of interest, including the phage 
ORFs and bacterial target genes of the present invention. 

A typical hybridization, for example, utilizes, besides the labeled probe of 
interest, a salt solution such as 6xSSC (NaCl and Sodium Citrate base) to stabilize 
nucleic acid strand interaction, a mild detergent such as 0.5% SDS, together with 
other typical additives such as Denhardt's solution and salmon sperm DNA. The 
solution is added to the immobilized sequence to be probed and incubated at suitable 
temperatures to preferably permit specific binding while minimizing nonspecific 
binding. The temperature of the incubations and ensuing washes is critical to the 
success and clarity of the hybridization. Stringent conditions employ relatively higher 
temperatures, lower salt concentrations, and/or more detergent than do non-stringent 
conditions. Hybridization temperatures also depend on the length, complementarity 
level, and nature (i.e., "GC content") of the sequences to be tested. Typical stringent
hybridizations and washes are conducted at temperatures of at least 40°C, while lower 
stringency hybridizations and washes are typically conducted at 37°C down to room 
temperature (~25°C). One of skill in the art is aware that these conditions may vary 
according to the parameters indicated above, and that certain additives such as 
formamide and dextran sulphate may also be added to affect the conditions. 

By "stringent hybridization conditions" is meant hybridization conditions at 
least as stringent as the following: hybridization in 50% formamide, 5X SSC, 50 mM
NaH2PO4, pH 6.8, 0.5% SDS, 0.1 mg/mL sonicated salmon sperm DNA, and 5X
Denhardt's solution at 42°C overnight; washing with 2X SSC, 0.1% SDS at 45°C; and
washing with 0.2X SSC, 0.1% SDS at 45°C. 

In sequence comparison analyses, an ORF, or motif, or set of motifs in a
bacteriophage sequence can be compared to known inhibitor sequences, e.g., 
homologous sequences encoding homologous inhibitors of bacterial function. 
Likewise, the analysis can include comparison with the structure of essential bacterial 
gene products, as structural similarities can be indicative of similar or replacement
biological function. Such analysis can include the identification of a signature, or
characteristic motif(s) of an inhibitor or inhibitor class.

Also, the identification of structural motifs in an encoded product, based on 
nucleotide or amino acid sequence analysis, can be used to infer a biochemical 
function for the product. A database containing identified structural motifs in a large 
number of sequences is available for identification of motifs in phage sequences. The 
database is PROSITE, which is available at www.expasy.ch/cgi-bin/scanprosite. The
identification of motifs can, for example, include the identification of signature motifs 
for a class or classes of inhibitory proteins. Other such databases may also be used. 
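
To show how a motif from a database such as PROSITE might be located in a phage ORF
product, the sketch below translates a PROSITE-style pattern into a regular
expression and reports every match in a protein sequence. It is a simplified
illustration: the function names are hypothetical, only the common pattern elements
are handled (single residues, x for any residue, [..] allowed sets, {..} excluded
sets, repeat counts in parentheses, and the < and > terminal anchors), and the
example pattern N-{P}-[ST]-{P} is used solely to demonstrate the syntax, not as an
inhibitor signature.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern such as 'N-{P}-[ST]-{P}' into a
    Python regular expression (simplified subset of the syntax)."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        repeat = ""
        if element.endswith(")"):            # repeat count, e.g. x(2) or x(2,4)
            element, count = element[:-1].split("(")
            repeat = "{" + count + "}"
        if element == "x":
            core = "."                       # any residue
        elif element.startswith("{"):
            core = "[^" + element[1:-1] + "]"  # any residue except those listed
        else:
            core = element                   # single residue or [..] set
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def scan_motif(pattern: str, protein: str):
    """Return (offset, matched_text) for every occurrence of the pattern.
    A lookahead is used so that overlapping occurrences are all reported."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(protein.upper())]


if __name__ == "__main__":
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(scan_motif("N-{P}-[ST]-{P}", "MNGSANPTQNASA"))
```

Offsets are 0-based. Signature motifs for a class of inhibitory proteins, once
defined, could be scanned against translated phage ORFs in the same manner.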

In aspects and preferred embodiments described herein, in which a bacterium 
or host bacterium is specified, the bacterium or host bacterium is preferably selected 
from a pathogenic bacterial species, for example, one selected from Table 1. 
Preferably, an animal or plant pathogen is used. For animals, preferably the bacterium 
is a bird or mammalian pathogen, still more preferably a human pathogen. 

In aspects and preferred embodiments involving a bacteriophage or sequences 
from a bacteriophage, one or more bacteriophage are preferably selected from those 
listed in Table 1. Those exemplary bacteriophage are readily obtained from the
indicated sources. 

In some cases, it is advantageous to utilize phage with non-pathogenic host 
bacteria. The genome, structural motif, ORF, homolog, and other analyses described 
herein can be performed on such phage and bacteria. Such analysis provides useful 
information and compositions. The results of such analyses can also be utilized in 
aspects of the present invention to identify homologous ORFs, especially inhibitor 
ORFs in phage with pathogenic bacterial hosts. Similarly, identification of a target in 
a non-pathogenic host can be used to identify homologous sequences and targets in 
pathogenic bacteria, especially in genetically closely related bacteria. Those skilled in 
the art are familiar with bacterial genetic relationships and with how to determine
relatedness based on levels of genomic identity or other measures of nucleotide
sequence and/or amino acid sequence similarity, and/or other physical and culture
characteristics such as morphology, nutritional requirements, or minimal media to
support growth. 

Also in preferred embodiments, an embodiment of this aspect is combined
with an embodiment of the following aspect. 

A related aspect of the invention provides methods for identifying a target for 
antibacterial agents by identifying the bacterial target(s) of at least one 
uncharacterized or untargeted inhibitor protein or RNA from a bacteriophage. Such 
identification allows the development of antibacterial agents active on such targets. 
Preferred embodiments for identifying such targets involve the identification of 
binding of target and phage ORF products to one another. The phage ORF products 
may be subportions of a larger ORF product that also binds the host target. In 
preferred embodiments, the phage protein or RNA is from an uncharacterized 
bacteriophage in Table 1. This aspect preferably includes the identification of a
plurality of such targets in one or a plurality of different bacteria, preferably in one or
a plurality of bacteria listed in Table 1.

In preferred embodiments of this aspect and other aspects of this invention 
involving particular phage ORFs or phage sequences, the ORF is Staphylococcus
aureus phage 77 ORF 17, 19, 43, 102, 104, or 182 as identified in U.S. application
09/407,804, S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or
Enterococcus sp. phage 182 ORF 002, 008, or 014.

As indicated for the above aspect, preferably the method involves the use of a 
plurality of different phage, and thus a plurality of different phage inhibitors and/or 
inhibitor ORFs. 

In addition to uncharacterized phage ORF products, it is also useful to identify
the targets of phage ORF products which are known to be inhibitors of host bacteria, 
but where the target has not been identified. Thus, such inhibitors can likewise be 
utilized as "untargeted" inhibitor phage ORFs and ORF products, e.g., proteins or 
RNAs. 

In the context of inhibitor proteins or RNAs from a phage, the term
"uncharacterized" means that a bacteria-inhibiting function for the protein has not
previously been identified. Preferably, but not necessarily, the sequence of the protein
or the corresponding coding region or ORF was not described in the art before the
filing of the present application for patent (or alternatively prior to the present
invention). Thus, this term specifically excludes any bacteria-inhibiting phage protein
and its associated bacterial target which has been identified as inhibitory before the
present invention or alternatively before the filing of the present application, for
example those identified in Tables 12-14 or otherwise identified herein. For example,
from E. coli, phage T7 genes 0.7 and 2.0 target the host RNA polymerase, and phage T4
gp55/gp33 alter the specificity of host RNA polymerase. The T4 regB gene product
also targets the host translation apparatus. As with the uncharacterized bacteriophage
ORFs or bacteriophage above, for such identified proteins, the sequences encoding
those proteins are excluded from the uncharacterized inhibitor proteins.

The term "fragment" refers to a portion of a larger molecule or assembly. For 
proteins, the term "fragment" refers to a molecule which includes at least 5 contiguous 
amino acids from the reference polypeptide or protein, preferably at least 8, 10, 12, 
15, 20, 30, 50 or more contiguous amino acids. In connection with oligo- or 
polynucleotides, the term "fragment" refers to a molecule which includes at least 15 
contiguous nucleotides from a reference polynucleotide, preferably at least 24, 30, 36, 
45, 60, 90, 150, or more contiguous nucleotides.
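The minimum-length criterion above can be illustrated with a short sketch. The following Python function is offered only as an illustration and is not a method described in this application: the function name, the example sequences, and the use of exact string matching are assumptions, while the thresholds of 5 contiguous amino acids and 15 contiguous nucleotides are those stated above.

```python
# Illustrative sketch only; names and example sequences are hypothetical.
MIN_PROTEIN_RUN = 5      # contiguous amino acids, as stated in the text
MIN_NUCLEOTIDE_RUN = 15  # contiguous nucleotides, as stated in the text


def includes_contiguous_run(candidate: str, reference: str, min_run: int) -> bool:
    """Return True if candidate contains at least min_run contiguous
    residues that occur, in the same order, in reference."""
    if min_run <= 0:
        raise ValueError("min_run must be positive")
    candidate = candidate.upper()
    reference = reference.upper()
    # Slide a window of length min_run along the candidate and look for
    # an exact match anywhere in the reference.
    for start in range(len(candidate) - min_run + 1):
        if candidate[start:start + min_run] in reference:
            return True
    return False


reference_protein = "MKTAYIAKQRQISFVK"  # made-up reference polypeptide
print(includes_contiguous_run("GGAYIAKGG", reference_protein, MIN_PROTEIN_RUN))
print(includes_contiguous_run("GGAYIAGG", reference_protein, MIN_PROTEIN_RUN))
```

In this example the first candidate shares the five-residue run AYIAK with the reference, whereas the second shares at most four contiguous residues and so does not meet the criterion.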

Preferred embodiments involving identification of binding include methods
for distinguishing bound molecules, for example, affinity chromatography,
immunoprecipitation, crosslinking, and/or genetic screen methods that permit
protein:protein interactions to be monitored. One of skill in the art is familiar with
these techniques and common materials utilized (see, e.g., Coligan, J. et al. (eds.) 
(1995) Current Protocols in Protein Science. John Wiley & Sons, Secaucus, N.J.). 

Genetic screening for the identification of protein:protein interactions typically 
involves the co-introduction of both a chimeric bait nucleic acid sequence (here, the 
phage ORF to be tested) and a chimeric target nucleic acid sequence that, when co- 
expressed and having affinity for one another in a host cell, stimulate reporter gene 
expression to indicate the relationship. A "positive" can thus suggest a potential 
inhibitory effect in bacteria. This is discussed in further detail in the Detailed 
Description section below. In this way, new bacterial targets can be identified that are 
inhibited by specific phage ORF products or derivatives, fragments, mimetics, or 
other molecules. 

Other embodiments involve the identification and/or utilization of mutant 
targets by virtue of their host's relatively unresponsive nature in the presence of 
expression of ORFs previously identified as inhibitory to the non-mutant or wild-type 
strain. Such mutants have the effect of protecting the host from an inhibition that 
would otherwise occur and indirectly allow identification of the precise responsible 
target for follow-up studies and anti-microbial development. In certain embodiments, 
rescue from inhibition occurs under conditions in which a bacterial target or mutant 
target is highly expressed. This is performed, for example, through coupling of the 
sequence with regulatory element promoters, e.g., as known in the art, which regulate 
expression at levels higher than wild-type, e.g., at a level sufficiently higher that the
inhibitor can be competitively bound to the highly expressed target such that the
bacterium is detectably less inhibited.

Identification of the bacterial target can involve identification of a phage-specific
site of action. This can involve a newly identified target, or a target where the
phage site of action differs from the site of action of a previously known antibacterial
agent or inhibitor. For example, phage T7 genes 0.7 and 2.0 target the host RNA
polymerase, which is also the cellular target for the antibacterial agent rifampin. To
the extent that a phage product is found to act at a different site than previously
described inhibitors, aspects of the present invention can utilize those new,
phage-specific sites for identification and use of new agents. The site of action can be
identified by techniques well-known to those skilled in the art, for example, by
mutational analysis, binding competition analysis, and/or other appropriate
techniques.

Once a bacterial host target protein or nucleic acid or mutant target sequence
has been identified and/or isolated, it too can be conveniently sequenced, sequence
analyzed (e.g., by computer), and the underlying gene(s), and corresponding translated 
product(s) further characterized. Preferred embodiments include such analysis and 
identification. Preferably such a target has not previously been identified as an 
appropriate target for antibacterial action. 

Certain embodiments include the identification of at least one inhibitory phage
ORF or ORF product, e.g., as described for the above aspect, and thus are a
combination of the two aspects. 

Additionally, the invention provides methods for identifying targets for
antibacterial agents by identifying homologs of a bacterial target (e.g., from S. aureus,
Enterococcus faecalis or other Enterococci, and Streptococcus pneumoniae) of a
bacteriophage inhibitory ORF product. Such homologs may be utilized in the various
aspects and embodiments described herein as described for the host Enterococcus sp.
for bacteriophage 182.

Other aspects of the invention provide isolated, purified, or enriched specific
phage nucleic acid and amino acid sequences, subsequences, and homologs thereof for
phage selected from uncharacterized phage listed in Table 1, preferably from
bacteriophage 77, 3A, 96, 44AHJD (Staphylococcus aureus host bacterium), Dp-1
(Streptococcus pneumoniae host), or 182 (Enterococcus host) or other phage listed in
Table 1 for those bacteria. For example, such sequences do not include sequences
identified in any of Tables 11-14. Nucleotide sequences of this aspect are at least 15
nucleotides in length, preferably at least 18, 21, 24, or 27 nucleotides in length, more
preferably at least 30, 50, or 90 nucleotides in length. In certain embodiments, longer
nucleic acids are preferred, for example those of at least 120, 150, 200, 300, 600, 900
or more nucleotides. Such sequences can, for example, be amplification
oligonucleotides (e.g., PCR primers), oligonucleotide probes, sequences encoding a
portion or all of a phage-encoded protein, or a fragment or all of a phage-encoded 
protein. In preferred embodiments, the nucleic acid sequence contains a sequence 
which is within a length range with a lower length as specified above, and an upper 
length limit which is no more than 50, 60, 70, 80, or 90% of the length of the 
corresponding full-length ORF. The upper length limit can also be expressed in terms 
of the number of base pairs of the ORF (coding region). In preferred embodiments, 
the nucleic acid sequence is from Staphylococcus aureus phage 77 ORF 17, 19, 43, 
102, 104, or 182 as identified in U.S. application 09/407,804, S. aureus phage
44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004,
008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 
002, 008, or 014. 

As it is recognized that alternate codons will encode the same amino acid for
most amino acids due to the degeneracy of the genetic code, the sequences of this
aspect include nucleic acid sequences utilizing such alternate codon usage for one or
more codons of a coding sequence. For example, all four nucleic acid sequences
GCT, GCC, GCA, and GCG encode the amino acid alanine. Therefore, if for an
amino acid there exists an average of three codons, a polypeptide of 100 amino acids
in length will, on average, be encoded by 3^100, or about 5 x 10^47, nucleic acid
sequences.
Thus, a nucleic acid sequence can be modified (e.g., a nucleic acid sequence from a 
phage as specified above) to form a second nucleic acid sequence encoding the same 
polypeptide as encoded by the first nucleic acid sequence using routine procedures 
and without undue experimentation. Thus, all possible nucleic acid sequences that 
encode the specified amino acid sequences are also fully described herein, as if all 
were written out in full, taking into account the codon usage, especially that preferred 
in the host bacterium. The alternate codon descriptions are available in common
textbooks, for example, Stryer, BIOCHEMISTRY 3rd ed., and Lehninger,
BIOCHEMISTRY 3rd ed., along with many others. Codon preference tables for
various types of organisms are available in the literature. Sequences with alternate 
codons at one or more sites can also be utilized in the computer-related aspects and 
embodiments herein. Because of the number of sequence variations involving 
alternate codon usage, for the sake of brevity, individual sequences are not separately 
listed herein. Instead the alternate sequences are described by reference to the natural
sequence with replacement of one or more (up to all, e.g., up to 3, 5, 10, 15, 20, 30, 40,
50, or more) of the degenerate codons with alternate codons from the alternate codon
table (Table 6), or a modified table applicable to a particular organism that has
differing codon usage, preferably with selection according to preferred codon usage
for the normal host organism or a host organism in which a sequence is intended to be
expressed. Those skilled in the art also understand how to alter the alternate codons to
be used for expression in organisms where certain codons code differently than shown
in the "universal" codon table.
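The degeneracy arithmetic above can be checked with a short sketch. The Python code below is an illustration only: it assumes the standard ("universal") genetic code rather than reproducing Table 6 of this application, and the function names and the example peptide are hypothetical.

```python
from itertools import product
from math import prod

# Standard ("universal") genetic code, written in the conventional T, C, A, G
# order; '*' marks stop codons. This is an assumption of the sketch and is not
# a reproduction of Table 6.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid: str) -> list[str]:
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide: str) -> int:
    """Number of distinct nucleic acid sequences encoding the peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide.upper())


print(synonymous_codons("A"))   # the four alanine codons named in the text
print(count_encodings("MAW"))   # hypothetical three-residue peptide
print(f"{3 ** 100:.2e}")        # the 100-residue estimate from the text
```

Under the standard code there are 61 sense codons for 20 amino acids, which is consistent with the average of about three codons per amino acid assumed above. The exact count for a given polypeptide depends on its composition, since methionine and tryptophan each have a single codon while leucine, serine, and arginine each have six.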

For amino acid sequences or polypeptides, sequences contain at least 5
peptide-linked amino acid residues, and preferably at least 6, 7, 10, 15, 20, 30, or 40 amino
acids having identical amino acid sequence as the same number of contiguous amino
acid residues in a particular phage ORF product. In some cases longer sequences may
be preferred, for example, those of at least 50, 60, 70, 80, or 100 amino acids in
length. In preferred embodiments, the amino acid sequence contains a sequence which
is within a length range with a lower length as specified above, and an upper length
limit which is no more than 50, 60, 70, 80, or 90% of the length of the corresponding
full-length ORF product. The upper length limit can also be expressed in terms of the
number of amino acid residues of the ORF product. In preferred embodiments, the 
amino acid sequence or polypeptide has bacteria-inhibiting function when expressed 
or otherwise present in a bacterial cell which is a host for the bacteriophage from 
which the sequence was derived. 

By "isolated" in reference to a nucleic acid is meant that a naturally occurring 
sequence has been removed from its normal cellular (e.g., chromosomal) environment 
or is synthesized in a non-natural environment (e.g., artificially synthesized). Thus,
the sequence may be in a cell-free solution or placed in a different cellular 
environment. The term does not imply that the sequence is the only nucleotide chain 
present, but that it is essentially free (about 90-95% pure at least) of non-nucleotide 
material naturally associated with it, and thus is distinguished from isolated 
chromosomes. 

The term "enriched" means that the specific DNA or RNA sequence 
constitutes a significantly higher fraction (2-5 fold) of the total DNA or RNA present 
in the cells or solution of interest than in normal or diseased cells or in cells from 
which the sequence was originally taken. This could be caused by a person by 
preferential reduction in the amount of other DNA or RNA present, or by a 
preferential increase in the amount of the specific DNA or RNA sequence, or by a 
combination of the two. However, it should be noted that enriched does not imply
that there are no other DNA or RNA sequences present, just that the relative amount
of the sequence of interest has been significantly increased.

The term "significant" is used to indicate that the level of increase is useful to
the person making such an increase and is an increase relative to other nucleic acids of
at least about 2-fold, more preferably at least 5- to 10-fold or even more. The term
also does not imply that there is no DNA or RNA from other sources. The other
source DNA may, for example, comprise DNA from a yeast or bacterial genome, or a
cloning vector such as pUC19. This term distinguishes from naturally occurring
events, such as viral infection, or tumor type growths, in which the level of one 
mRNA may be naturally increased relative to other species of mRNA. That is, the 
term is meant to cover only those situations in which a person has intervened to 
elevate the proportion of the desired nucleic acid. 
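The fold-increase criterion for "enriched" and "significant" can be expressed as a simple ratio of fractions. The Python sketch below is illustrative only; the function names and the example amounts are hypothetical, and the 2-fold default threshold is the lower bound stated above.

```python
def fold_enrichment(target_after: float, total_after: float,
                    target_before: float, total_before: float) -> float:
    """Ratio of the target's fraction of total nucleic acid after
    intervention to its fraction before intervention."""
    if min(total_after, total_before, target_before) <= 0:
        raise ValueError("totals and the starting target amount must be positive")
    fraction_after = target_after / total_after
    fraction_before = target_before / total_before
    return fraction_after / fraction_before


def is_significant(fold: float, threshold: float = 2.0) -> bool:
    """Apply the 'at least about 2-fold' criterion from the text."""
    return fold >= threshold


# Hypothetical amounts in arbitrary mass units.
fold = fold_enrichment(target_after=5.0, total_after=100.0,
                       target_before=1.0, total_before=100.0)
print(round(fold, 2), is_significant(fold), is_significant(fold, threshold=10.0))
```

Because the measure is a ratio of fractions, it is the same whether the increase arises from a preferential reduction in other DNA or RNA, a preferential increase in the specific sequence, or a combination of the two.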

It is also advantageous for some purposes that a nucleotide sequence be in 
purified form. The term "purified" in reference to nucleic acid does not require 
absolute purity (such as a homogeneous preparation). Instead, it represents an 
indication that the sequence is relatively more pure than in the natural environment 
(compared to the natural level, this level should be at least 2-5 fold greater, e.g., in 
terms of mg/mL). Individual clones isolated from a cDNA library may be purified to 
electrophoretic homogeneity. The claimed DNA molecules obtained from these 
clones could be obtained directly from total DNA or from total RNA. The cDNA 
clones are not naturally occurring, but rather are preferably obtained via manipulation 
of a partially purified naturally occurring substance (messenger RNA). The 
construction of a cDNA library from mRNA involves the creation of a synthetic 
substance (cDNA) and pure individual cDNA clones can be isolated from the 
synthetic library by clonal selection of the cells carrying the cDNA library. Thus, the 
process which includes the construction of a cDNA library from mRNA and isolation 
of distinct cDNA clones yields an approximately 10^6-fold purification of the native
message. Thus, purification of at least one order of magnitude, preferably two or 
three orders, and more preferably four or five orders of magnitude is expressly 
contemplated. 

The terms "isolated", "enriched", and "purified" with respect to nucleic acids,
above, may similarly be used to denote the relative purity and abundance of
polypeptides (multimers of amino acids joined one to another by α-carboxyl:α-amino
group (peptide) bonds). These, too, may be stored in, grown in, screened in, and
selected from libraries using biochemical techniques familiar in the art. Such
polypeptides may be natural, synthetic or chimeric and may be extracted using any of
a variety of methods, such as antibody immunoprecipitation, other "tagging"
techniques, conventional chromatography and/or electrophoretic methods. Some of
the above utilize the corresponding nucleic acid sequence.

As indicated above, aspects and embodiments of the invention are not limited
to entire genes and proteins. The invention also provides and utilizes fragments and 
portions thereof, preferably those which are "active" in the inhibitory sense described 
above. Such peptides or oligopeptides and oligo or polynucleotides have preferred 
lengths as specified above for nucleic acid and amino acid sequences from phage; 
corresponding recombinant constructs can be made to express the encoded same. 
Also included are homologous sequences and fragments thereof. 

Nucleic acid sequences of the present invention can be isolated using a method
similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by well- 
known methods. Also, by having particular phage ORFs, e.g., the phage ORFs 
identified herein (e.g., anti-bacterial ORFs of the present invention, portions thereof, 
or oligonucleotides derived therefrom as described), other antimicrobial sequences 
from other bacteriophage sources can be identified and isolated using methods 
described here or other methods, including methods utilizing nucleic acid 
hybridization and/or computer-based sequence alignment methods. 

The invention also provides bacteriophage antimicrobial DNA segments from 
other phages based on nucleic acids and sequences hybridizing to the presently 
identified inhibitory ORF under high stringency conditions or sequences that are 
highly homologous. The bacteriophage segment from a specific phage, e.g., an 
antimicrobial DNA segment, can be used to identify a related segment from another 
unrelated phage based on stringent conditions of hybridization or on being a homolog 
based on nucleic acid and/or amino acid sequence comparisons. As with identified 
inhibitory sequences, such homologous coding sequences and products can be used as 
antimicrobials, to construct active portions or derivatives, to construct 
peptidomimetics, and to identify bacterial targets. 

The nucleotide and amino acid sequences identified herein are believed to be
correct; however, certain sequences may contain a small percentage of errors, e.g., 1-5%.
In the event that any of the sequences have errors, the corrected sequences can be
readily provided by one skilled in the art using routine methods. For example, the
nucleotide sequences can be confirmed or corrected by obtaining and culturing the
relevant phage, and purifying phage genomic nucleic acids. A region or regions of
interest can be amplified, e.g., by PCR from the appropriate genomic template, using
primers based on the described sequence. The amplified regions can then be
sequenced using any of the available methods (e.g., a dideoxy termination method).
This can be done redundantly to provide the corrected sequence or to confirm that the
described sequence is correct. Alternatively, a particular sequence or sequences can
be identified and isolated as an insert or inserts in a phage genomic library and
isolated, amplified, and sequenced by standard methods. Confirmation or correction
of a nucleotide sequence for a phage gene provides an amino acid sequence of the
encoded product by merely reading off the amino acid sequence according to the
normal codon relationships, and/or the gene can be expressed in a standard expression
system and the polypeptide product sequenced by standard techniques. The sequences
described herein thus provide unique identification of the corresponding genes, coding
sequences, and other sequences, allowing those sequences to be used in the various
aspects of the present invention.

In other aspects, the invention provides recombinant vectors and cells
harboring at least one of the phage ORFs or portion thereof, or bacterial target
sequences described herein. As understood by those skilled in the art, vectors may be
provided in different forms, including, for example, plasmids, cosmids, and
virus-based vectors. See, e.g., Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; see also Ausubel,
F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John Wiley & Sons,
Secaucus, N.J.

In preferred embodiments, the vectors will be expression vectors, preferably
shuttle vectors that permit cloning, replication, and expression within bacteria. An
"expression vector" is one having regulatory nucleotide sequences containing
transcriptional and translational regulatory information that controls expression of the
nucleotide sequence in a host cell. Preferably the vector is constructed to allow
amplification from vector sequences flanking an insert locus. In certain embodiments,
the expression vectors may additionally or alternatively support expression and/or
replication in animal, plant and/or yeast cells due to the presence of suitable
regulatory sequences, e.g., promoters, enhancers, 3' stabilizing sequences, primer
sequences, etc. In preferred embodiments, the promoters are inducible and specific
for the system in which expression is desired, e.g., bacteria, animal, plant, or yeast.
The vectors may optionally encode a "tag" sequence or sequences to facilitate protein
purification. Convenient restriction enzyme cloning sites and suitable selective
marker(s) are also optionally included. Such selective markers can be, for example,
antibiotic resistance markers or markers which supply an essential nutritive growth
factor to an otherwise deficient mutant host, e.g., tryptophan, histidine, or leucine in
the Yeast Two-Hybrid systems described below.

The term "recombinant vector" relates to a single- or double-stranded circular
nucleic acid molecule that can be transfected into cells and replicated within or 
independently of a cell genome. A circular double-stranded nucleic acid molecule can 
be cut and thereby linearized upon treatment with appropriate restriction enzymes. An 
assortment of nucleic acid vectors, restriction enzymes, and the knowledge of the 
nucleotide sequences cut by restriction enzymes are readily available to those skilled 
in the art. A nucleic acid molecule encoding a desired product can be inserted into a 
vector by cutting the vector with restriction enzymes and ligating the two pieces 
together. Preferably the vector is an expression vector, e.g., a shuttle expression
vector as described above. 

By "recombinant cell" is meant a cell possessing introduced or engineered
nucleic acid sequences, e.g., as described above. The sequence may be in the form of 
or part of a vector or may be integrated into the host cell genome. Preferably the cell 
is a bacterial cell. 

In another aspect, the invention also provides methods for identifying and/or 
screening compounds "active on" at least one bacterial target of a bacteriophage 
inhibitor protein or RNA. Preferred embodiments involve contacting such a bacterial 
target or targets (e.g., bacterial target proteins) with a test compound, and determining
whether the compound binds to or reduces the level of activity of the bacterial target
(e.g., a bacterial target protein). Preferably this is done either in vivo (i.e., in a
cell-based assay) or in vitro, e.g., in a cell-free system under approximately physiological
conditions. 

The compounds that can be used may be large or small, synthetic or natural, 
organic or inorganic, proteinaceous or non-proteinaceous. In preferred embodiments,
the compound is a peptidomimetic, as described herein, a bacteriophage inhibitor 
protein or fragment or derivative thereof, preferably an "active portion", or a small 
molecule. 

In preferred embodiments, the bacterial target is a target of a phage ORF 
identified herein, e.g., S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus 
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, 
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014. 

In particular embodiments, the methods include the identification of bacterial 
targets or the site of action of an inhibitor on a bacterial target as described above or 
otherwise described herein. 

In embodiments involving binding assays, preferably binding is to a fragment
or portion of a bacterial target protein, where the fragment includes less than 90%,
80%, 70%, 60%, 50%, 40%, or 30% of an intact bacterial target protein. Preferably,
the at least one bacterial target includes a plurality of different targets of
bacteriophage inhibitor proteins, preferably a plurality of different targets. The 
plurality of targets can be in or from a plurality of different bacteria, but preferably is 
from a single bacterial species. 

A "method of screening" refers to a method for evaluating a relevant activity 
or property of a large plurality of compounds (e.g., a bacteria-inhibiting activity), 
rather than just one or a few compounds. For example, a method of screening can be 
used to conveniently test at least 100, more preferably at least 1000, still more 
preferably at least 10,000, and most preferably at least 100,000 different compounds, 
or even more. 

In the context of this invention, the term "small molecule" refers to 
compounds having molecular mass of less than 2000 Daltons, preferably less than 
1500, still more preferably less than 1000, and most preferably less than 600 Daltons. 
Preferably but not necessarily, a small molecule is not an oligopeptide. 
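The tiered mass limits in this definition can be applied mechanically. The following Python sketch is illustrative only; the function name and the example masses are hypothetical, and the limits are those recited above.

```python
from typing import Optional

# Upper mass limits in Daltons, from the least to the most preferred tier,
# as recited in the text.
MASS_LIMITS_DA = (2000.0, 1500.0, 1000.0, 600.0)


def strictest_limit_met(molecular_mass_da: float) -> Optional[float]:
    """Return the smallest recited limit that the mass falls below,
    or None if the compound is not a small molecule under this definition."""
    met = [limit for limit in MASS_LIMITS_DA if molecular_mass_da < limit]
    return min(met) if met else None


for mass in (450.0, 1200.0, 2500.0):  # hypothetical compound masses
    print(mass, strictest_limit_met(mass))
```

The comparison is strict ("less than"), so a compound of exactly 2000 Daltons falls outside the definition in this sketch.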

In a related aspect or in preferred embodiments, the invention provides a 
method of screening for potential antibacterial agents by determining whether any of a 
plurality of compounds, preferably a plurality of small molecules, is active on at least 
one target of a bacteriophage inhibitor protein or RNA. Preferred embodiments 
include those described for the above aspect, including embodiments which involve 
determining whether one or more test compounds bind to or reduce the level of 
activity of a bacterial target, and embodiments which utilize a plurality of different 
targets as described above. 

The identification of bacteria-inhibiting phage ORFs and their encoded
products also provides a method for identifying an active portion of such an encoded 
product. This also provides a method for identifying a potential antibacterial agent by 
identifying such an active portion of a phage ORF or ORF product. In preferred 
embodiments, the identification of an active portion involves one or more of 
mutational analysis, deletion analysis, or analysis of fragments of such products. The 
method can also include determination of a 3-dimensional structure of an active 
portion, such as by analysis of crystal diffraction patterns. In further embodiments, 
the method involves constructing or synthesizing a peptidomimetic compound, where 
the structure of the peptidomimetic compound corresponds to the structure of the 
active portion. In this context, "corresponds" means that the peptidomimetic 
compound structure has sufficient similarities to the structure of the active portion that 
the peptidomimetic will interact with the same molecule as the phage protein and
preferably will elicit at least one cellular response in common which relates to the
inhibition of the cell by the phage protein.

In preferred embodiments, the ORF or ORF product is or is derived or
obtained from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus pneumoniae
phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or
Enterococcus sp. phage 182 ORF 002, 008, or 014 or product thereof.

The methods for identifying or screening for compounds or agents active on a
bacterial target of a phage-encoded inhibitor can also involve identification of a
phage-specific site of action on the target.

Preferably in the methods for identifying or screening for compounds active
on such a bacterial target, the target is uncharacterized; the target is from an
uncharacterized bacterium from Table 1; the site of action is a phage-specific site of
action.

Further embodiments include the identification of inhibitor phage ORFs and 
bacterial targets as in aspects above. 

An "active portion" as used herein denotes an epitope, a catalytic or regulatory 
domain, or a fragment of a bacteriophage inhibitor protein that is responsible for, or a
significant factor in, bacterial target inhibition. The active portion preferably may be 
removed from its contiguous sequences and, in isolation, still effect inhibition. 

By "mimetic" is meant a compound structurally and functionally related to a 
reference compound that can be natural, synthetic, or chimeric. In terms of the present 
invention, a "peptidomimetic," for example, is a compound that mimics the
activity-related aspects of the 3-dimensional structure of a peptide or polypeptide in a
non-peptide compound, for example mimics the structure of a peptide or active portion of
a phage- or bacterial ORF-encoded polypeptide. 

A related aspect provides a method for inhibiting a bacterial cell by contacting 
the bacterial cell with a compound active on a bacterial target of a bacteriophage 
inhibitor protein or RNA, where the target was uncharacterized. In preferred 
embodiments, the compound is such a protein, or a fragment or derivative thereof; a 
structural mimetic, e.g., a peptidomimetic, of such a protein or fragment; a small 
molecule; the contacting is performed in vitro; the contacting is performed in vivo in
an infected or at risk organism, e.g., an animal such as a mammal or bird, for example,
a human, or other mammal described herein; the bacterium is selected from a genus
and/or species listed in Table 1; the bacteriophage inhibitor protein is uncharacterized;
the bacteriophage inhibitor protein is from an uncharacterized phage listed in Table 1;
the phage inhibitor protein is from one of S. aureus phage 44AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

In the context of targets in this invention, the term "uncharacterized" means
that the target was not recognized as an appropriate target for an antibacterial agent 
prior to the filing of the present application or alternatively prior to the present 
invention. Such lack of recognition can include, for example, situations where the 
target and/or a nucleotide sequence encoding the target were unknown, situations 
where the target was known, but where it had not been identified as an appropriate 
target or as an essential cellular component, and situations where the target was 
known as essential but had not been recognized as an appropriate target due to a belief 
that the target would be inaccessible or otherwise that contacting the cell with a 
compound active on the target in vitro would be ineffective in cellular inhibition, or 
ineffective in treatment of an infection. Methods described herein utilizing bacterial 
targets, e.g., for inhibiting bacteria or treating bacterial infections, can also utilize 
"uncharacterized target sites", meaning that the target has been previously recognized 
as an appropriate target for an antibacterial agent, but where an agent or inhibitor of 
the invention is used which acts at a different site than that at which the previously
utilized antibacterial agent acts, i.e., a phage-specific site. Preferably the phage-specific
site has different functional characteristics from the previously utilized site. In the 
context of targets or target sites, the term "phage-specific" indicates that the target or 
site is utilized by at least one bacteriophage as an inhibitory target and is different 
from previously identified targets or target sites. 

In the context of this invention, the term "bacteriophage inhibitor protein" 
refers to a protein encoded by a bacteriophage nucleic acid sequence which inhibits 
bacterial function in a host bacterium. Thus, it is a bacteria-inhibiting phage product. 

In the context of this invention, the phrase "contacting the bacterial cell with a 
compound active on a bacterial target of a bacteriophage inhibitor protein" or 
equivalent phrases refer to contacting with an isolated, purified, or enriched 
compound or a composition including such a compound, but specifically does not rely 
on contacting the bacterial cell with an intact phage which encodes the compound. 
Preferably no intact phage are involved in the contacting. 

Related aspects provide methods for prophylactic or therapeutic treatment of a 
bacterial infection by administering to an infected, challenged or at risk organism a 
therapeutically or prophylactically effective amount of a compound active on a target 
of a bacteriophage inhibitor protein or RNA, or as described for the previous aspect. 
Preferably the bacterium involved in the infection or risk of infection produces the 
identified target of the bacteriophage inhibitor protein or alternatively produces a
homologous target compound. In preferred embodiments, the host organism is a plant 
or animal, preferably a mammal or bird, and more preferably, a human or other
mammal described herein. Preferred embodiments include, without limitation, those
as described for the preceding aspect. 

Compounds useful for the methods of inhibiting, methods of treating, and 
pharmaceutical compositions can include novel compounds, but can also include 
compounds which had previously been identified for a purpose other than inhibition
of bacteria. Such compounds can be utilized as described and can be included in 
pharmaceutical compositions. 

In preferred embodiments of this and other aspects of the invention utilizing
bacterial target sequences of a bacteriophage inhibitory ORF product, the target
sequence is encoded by a Staphylococcus nucleic acid coding sequence, preferably S.
aureus, a Streptococcus nucleic acid coding sequence, preferably Streptococcus
pneumoniae, or an Enterococcus nucleic acid coding sequence. Possible target
sequences are described herein by reference to sequence source sites.

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not reproduced herein but are described by reference to the GenBank entries
instead of being written out in full. In cases
where the TIGR or GenBank entry for a coding region is not complete, the complete
sequence can be readily obtained by routine methods, e.g., by isolating a clone in a
phage host genomic library, and sequencing the clone insert to provide the relevant
coding region. The boundaries of the coding region can be identified by conventional
sequence analysis and/or by expression in a bacterium in which the endogenous copy
of the coding region has been inactivated and using subcloning to identify the
functional start and stop codons for the coding region.

In the context of nucleic acid or amino acid sequences of this invention, the
term "corresponding" indicates that the sequence is at least 95% identical, preferably
at least 97% identical, and more preferably at least 99% identical to a sequence from
the specified phage genome, a ribonucleotide equivalent, a degenerate equivalent
(utilizing one or more degenerate codons), or a homologous sequence, where the
homolog provides functionally equivalent biological function.
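
The notion of a degenerate equivalent can be illustrated with a minimal sketch. It assumes the standard ("universal") genetic code; the names CODON_TABLE, translate, and synonymous_codons, as well as the example sequences, are hypothetical and serve only as illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid: str) -> list:
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Two different (made-up) nucleotide sequences that are degenerate equivalents:
# they differ at several positions but encode the same peptide.
seq_a = "ATGCTTTCT"
seq_b = "ATGTTGAGC"
same_peptide = translate(seq_a) == translate(seq_b)
```

Because several amino acids are encoded by more than one codon, two sequences can differ at the nucleotide level yet specify an identical polypeptide, which is why degenerate equivalents are treated as corresponding sequences.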

By "treatment" or "treating" is meant administering a compound or 
pharmaceutical composition for prophylactic and/or therapeutic purposes. The term
"prophylactic treatment" refers to treating a patient or animal that is not yet infected
but is susceptible to or otherwise at risk of a bacterial infection. The term "therapeutic
treatment" refers to administering treatment to a patient already suffering from an
infection.

The term "bacterial infection" refers to the invasion of the host organism,
animal or plant, by pathogenic bacteria. This includes the excessive growth of bacteria 
which are normally present in or on the body of the organism, but more generally, a 
bacterial infection can be any situation in which the presence of a bacterial 
population(s) is damaging to a host organism. Thus, for example, an organism suffers
from a bacterial infection when excessive numbers of a bacterial population are
present in or on the organism's body, or when the effects of the presence of a bacterial
population(s) are damaging to the cells, tissue, or organs of the organism.

The terms "administer", "administering", and "administration" refer to a 
method of giving a dosage of a compound or composition, e.g., an antibacterial 
pharmaceutical composition, to an organism. Where the organism is a mammal, the 
method is, e.g., topical, oral, intravenous, transdermal, intraperitoneal, intramuscular, 
or intrathecal. The preferred method of administration can vary depending on various
factors, e.g., the components of the pharmaceutical composition, the site of the 
potential or actual bacterial infection, the bacterium involved, and the infection 
severity. 

The term "mammal" has its usual biological meaning referring to any 
organism of the Class Mammalia of higher vertebrates that nourish their young with 
milk secreted by mammary glands, e.g., mouse, rat, and, in particular, human, bovine, 
sheep, swine, dog, and cat.

In the context of treating a bacterial infection a "therapeutically effective 
amount" or "pharmaceutically effective amount" indicates an amount of an
antibacterial agent, e.g., as disclosed for this invention, which has a therapeutic effect. 
This generally refers to the inhibition, to some extent, of the normal cellular 
functioning of bacterial cells that renders or contributes to bacterial infection. 

The dose of antibacterial agent that is useful as a treatment is a 
"therapeutically effective amount." Thus, as used herein, a therapeutically effective 
amount means an amount of an antibacterial agent that produces the desired 
therapeutic effect as judged by clinical trial results and/or animal models. This amount 
can be routinely determined by one skilled in the art and will vary depending on 
several factors, such as the particular bacterial strain involved and the particular 
antibacterial agent used. 

In connection with claims to methods of inhibiting bacteria and therapeutic or
prophylactic treatments, "a compound active on a target of a bacteriophage inhibitor
protein" or terms of equivalent meaning differ from administration of or contact with
an intact phage naturally encoding the full-length inhibitor compound. While an
intact phage may conceivably be incorporated in the present methods, the method at
least includes the use of an active compound as specified different from a full length
inhibitor protein naturally encoded by a bacteriophage and/or a delivery or contacting 
method different from administration of or contact with an intact phage encoding the 
full-length protein. Similarly, pharmaceutical compositions described herein at least 
include an active compound different from a full-length inhibitor protein naturally 
encoded by a bacteriophage or such a full-length protein is provided in the 
composition in a form different from being encoded by an intact phage. Preferably 
the methods and compositions do not include an intact phage. 

In accord with the above aspects, the invention also provides antibacterial 
agents and compounds active on bacterial targets of bacteriophage inhibitor proteins 
or RNAs, where the target was uncharacterized as indicated above. As previously 
indicated, such active compounds include both novel compounds and compounds 
which had previously been identified for a purpose other than inhibition of bacteria. 
Such previously identified biologically active compounds can be used in 
embodiments of the above methods of inhibiting and treating. In preferred 
embodiments, the targets, bacteriophage, and active compound are as described herein 
for methods of inhibiting and methods of treating. Preferably the agent or compound 
is formulated in a pharmaceutical composition which includes a pharmaceutically 
acceptable carrier, excipient, or diluent. In addition, the invention provides agents, 
compounds, and pharmaceutical compositions where an active compound is active on 
an uncharacterized phage-specific site. 

In preferred embodiments, the target is as described for embodiments of 
aspects above. 

Likewise, the invention provides a method of making an antibacterial agent. 
The method involves identifying a target of a bacteriophage inhibitor polypeptide or 
protein or RNA, screening a plurality of compounds to identify a compound active on 
the target, and synthesizing the compound in an amount sufficient to provide a 
therapeutic effect when administered to an organism infected by a bacterium naturally 
producing the target. In preferred embodiments, the identification of the target and 
identification of active compounds include steps or methods and/or components as 
described above (or otherwise herein) for such identification. Likewise, the active 
compound can be as described above, including fragments and derivatives of phage 
inhibitor proteins, peptidomimetics, and small molecules. As recognized by those 
skilled in the art, peptides can be synthesized by expression systems and purified, or 
can be synthesized artificially. In preferred embodiments the inhibitory phage ORF
product is from S. aureus phage 44AHJD ORF 1, 9, or 12, Streptococcus
pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021, 029, 030, 038,
or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

As indicated above, sequence analysis of nucleotide and/or amino acid 
sequences can beneficially utilize computer analysis. Thus, in additional aspects the 
invention provides computer-related hardware and media and methods utilizing and 
incorporating sequence data from uncharacterized phage, e.g., uncharacterized phage
listed in Table 1, preferably at least one of Staphylococcus aureus
phage 44 AHJD ORF 1, 9, or 12, Streptococcus pneumoniae phage Dp-1 ORF 001,
002, 004, 008, 010, 013, 016, 021, 029, 030, 038, or 041, or Enterococcus sp. phage
182 ORF 002, 008, or 014, or S. aureus phage 44 AHJD, Enterococcus sp. phage 182, or
Streptococcus pneumoniae phage Dp-1. In general, such aspects can facilitate the
above-described aspects. Various embodiments involve the analysis of genetic 
sequence and encoded products, as applied to evaluating bacteriophage inhibitor
ORFs and compounds and fragments related thereto. The various sequence analyses, 
as well as function analyses, can be used separately or in combination, as well as in 
preceding aspects and embodiments. Use in combination is often advantageous as the 
additional information allows more efficient prioritizing of phage ORFs for 
identification of those ORFs that provide bacteria-inhibiting function. 

In one aspect, the invention provides a computer-readable device which 
includes at least one recorded amino acid or nucleotide sequence corresponding to one 
of the specified phage and a sequence analysis program for analyzing a nucleotide 
and/or amino acid sequence. The device is arranged such that the sequence 
information can be retrieved and analyzed using the analysis program. The analysis 
can identify, for example, homologous sequences or the indicated percentages of the phage
genome and structural motifs. Preferably the sequence includes at least 1 phage ORF 
or encoded product, more preferably at least 10%, 20%, 30%, 40%, 50%, 70%, 90%, 
or 100% of the genomic phage ORFs and/or equivalent cDNA, RNA, or amino acid 
sequences. Preferably the sequence or sequences in the device are recorded in a 
medium such as a floppy disk, a computer hard drive, an optical disk, computer 
random access memory (RAM), or magnetic tape. The program may also be recorded 
in such medium. The sequences can also include sequences from a plurality of 
different phage. 

In this context, the term "corresponding" indicates that the sequence is at least 
95% identical, preferably at least 97% identical, and more preferably at least 99% 
identical to a sequence from the specified phage genome, a ribonucleotide equivalent, 
a degenerate equivalent (utilizing one or more degenerate codons), or a homologous 
sequence, where the homolog provides functionally equivalent biological function. 
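
The identity thresholds in this definition can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with '-' marking gaps) by a standard alignment program; the function names percent_identity and corresponds and the example sequences are hypothetical and not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue.

    Columns containing a gap ('-') are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def corresponds(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True when identity meets the threshold (95%, or the preferred 97% or 99%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Made-up 100-nucleotide reference and a variant carrying four substitutions.
reference = "ATGC" * 25
variant = list(reference)
for position, base in ((0, "T"), (10, "A"), (20, "T"), (30, "A")):
    variant[position] = base
variant = "".join(variant)
```

Here the variant is 96% identical to the reference, so it meets the 95% threshold but not the preferred 97% threshold.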

Similarly, the invention provides a computer analysis system for identifying
biologically important portions of a bacteriophage genome. The system includes a
data storage medium, e.g., as identified above, which has recorded thereon a
nucleotide sequence corresponding to at least a portion of at least one uncharacterized
bacteriophage genome, a set of program instructions to allow searching of the
sequence or sequences to analyze the sequence, and an output device, where the
portion includes at least the sequence length as specified in the preceding aspect. The
output device is preferably a printer, a video display, or a recording medium. More
than one output device may be included. For each of the present computer-related
aspects, the bacteriophage are preferably selected from the uncharacterized phage
listed in Table 1, more preferably from bacteriophage 77, 3A, 96, 44 AHJD (S.
aureus), Dp-1 (Streptococcus pneumoniae), or 182 (Enterococcus).

In keeping with the computer device aspects, the invention also provides a
method for identifying or characterizing a bacteriophage ORF by providing a
computer-based system for analyzing nucleotide or amino acid sequences, e.g., as
described above. The system includes a data storage medium which has recorded a
sequence or sequences as described for the above devices, a set of instructions as in
the preceding aspect, and an output device as in the preceding aspect. The method
further involves analyzing at least one sequence, and outputting the analysis results to
at least one output device.
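
The three elements of such a system (recorded sequences, program instructions, and an output device) can be sketched as follows. This is only an illustrative arrangement under stated assumptions: an in-memory dictionary stands in for the data storage medium, an exact motif search stands in for the analysis program, and a text stream stands in for the output device. The names find_motif and analyze_store and the example sequences are hypothetical.

```python
import io
import sys
from typing import Dict, List, TextIO


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 1-based start positions of every (possibly overlapping) motif occurrence."""
    if not motif:
        raise ValueError("motif must be non-empty")
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


def analyze_store(store: Dict[str, str], motif: str,
                  out: TextIO = sys.stdout) -> Dict[str, List[int]]:
    """Search every recorded sequence for the motif and write one line per sequence."""
    results = {name: find_motif(seq, motif) for name, seq in store.items()}
    for name, hits in results.items():
        out.write(f"{name}\t{motif}\t{hits}\n")
    return results


# Made-up recorded sequences; a StringIO buffer plays the role of the output device.
store = {"orf_a": "ATGGGAGGATGA", "orf_b": "CCCC"}
report = io.StringIO()
hits = analyze_store(store, "ATG", out=report)
```

Passing a different stream (a file, for example) as the out argument changes the output device without changing the analysis.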

In preferred embodiments, the analysis identifies a sequence similarity or
homology with a sequence or sequences selected from bacterial ORFs encoding
products with related biological function; ORFs encoding known inhibitors; and
essential bacterial ORFs. Preferably the analysis identifies a probable biological
function based on identification of structural elements or characteristic or signature
motifs of an encoded product or on sequence similarity or homology. Preferably the
uncharacterized bacteriophage is from Table 1, more preferably at least one of
bacteriophage 77, 3A, 96, 44 AHJD (S. aureus), Dp-1 (Streptococcus pneumoniae), or
182 (Enterococcus). In preferred embodiments, the method also involves determining
at least a portion of the nucleotide sequence of at least one uncharacterized
bacteriophage as indicated, and recording that sequence on the data storage medium of the
computer-based system. In preferred embodiments, the analysis identifies a sequence
similarity or homology with a S. aureus phage 44 AHJD ORF 1, 9, or 12,
Streptococcus pneumoniae phage Dp-1 ORF 001, 002, 004, 008, 010, 013, 016, 021,
029, 030, 038, or 041, or Enterococcus sp. phage 182 ORF 002, 008, or 014.

As used in the claims to describe the various inventive aspects and
embodiments, "comprising" means including, but not limited to, whatever follows the
word "comprising". Thus, use of the term "comprising" indicates that the listed
elements are required or mandatory, but that other elements are optional and may or
may not be present. By "consisting of" is meant including, and limited to, whatever
follows the phrase "consisting of". Thus, the phrase "consisting of" indicates that the
listed elements are required or mandatory, and that no other elements may be present.
By "consisting essentially of" is meant including any elements listed after the phrase,
and limited to other elements that do not interfere with or contribute to the activity or
action specified in the disclosure for the listed elements. Thus, the phrase "consisting
essentially of" indicates that the listed elements are required or mandatory, but that
other elements are optional and may or may not be present depending upon whether or
not they affect the activity or action of the listed elements.

Further embodiments will be apparent from the following Detailed Description 
and from the claims. 



BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURES 1A and 1B are flow schematics showing the manipulations used to
convert pT0021, an arsenite inducible vector containing the luciferase gene, into
pTHA or pTM, two ars inducible vectors. Vector pTHA contains Bam HI, Sal I, and
Hind III cloning sites and a downstream HA epitope tag. Vector pTM contains Bam
HI and Hind III cloning sites and no HA epitope tag.

FIGURE 2 is a schematic representation of the cloning steps involved to place
the DNA segments of any of ORFs 17, 19, 43, 102, 104, or 182, or other sequences, into
pTHA to assess inhibitory potential. For subcloning into pTM or pT0021, individual
ORFs were amplified by PCR using oligonucleotides targeting the ATG and stop
codons of the ORFs. Using this strategy, Bam HI and Hind III sites were positioned
immediately upstream and downstream, respectively, of the start and stop codons of
each ORF. Following digestion with Bam HI and Hind III, the PCR fragments were
subcloned into the same sites of pT0021 or pTM. Clones were verified by PCR and
direct sequencing.
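
The primer strategy described for FIGURE 2 can be sketched as follows. This is a minimal illustration, not the oligonucleotides actually used: the 18-nucleotide annealing length, the absence of extra 5' bases ahead of each restriction site, the function names, and the example ORF are all assumptions. Only the recognition sequences of Bam HI (GGATCC) and Hind III (AAGCTT) are standard.

```python
BAMHI_SITE = "GGATCC"    # Bam HI recognition sequence
HINDIII_SITE = "AAGCTT"  # Hind III recognition sequence (palindromic)
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(orf: str, anneal_len: int = 18):
    """Return (forward, reverse) primers, both written 5' to 3'.

    The forward primer places a Bam HI site immediately upstream of the ATG;
    the reverse primer places a Hind III site immediately downstream of the
    stop codon (on the opposite strand).
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with ATG")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the annealing length")
    forward = BAMHI_SITE + orf[:anneal_len]
    reverse = HINDIII_SITE + reverse_complement(orf[-anneal_len:])
    return forward, reverse


# Made-up ORF used purely for illustration.
example_orf = "ATGAAACCCGGGTTTAAACCCGGGTAA"
forward_primer, reverse_primer = design_primers(example_orf)
```

Amplification with such a pair would yield a fragment of the form Bam HI site, ORF, Hind III site, so that digestion with both enzymes gives a directional insert.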

FIGURE 3 shows a schematic representation of the functional assays used to
characterize the bactericidal and bacteriostatic potential of all predicted ORFs (>33
amino acids) encoded by bacteriophage 77. Fig. 3A) Functional assay on semi-solid
support media. Fig. 3B) Functional assay in liquid culture.

FIGURES 4A, B, and C are bar graphs showing the results of a screen in liquid
media to assess bacteriostatic or bactericidal activity of 93 predicted ORFs (>33
amino acids) encoded by bacteriophage 77. Growth inhibition assays were performed
as detailed in the Detailed Description. The relative growth of Staphylococcus aureus
transformants harboring a given bacteriophage 77 ORF (identified on the bottom of
the graph), in the absence or presence of arsenite, is plotted relative to growth of a
Staphylococcus aureus transformant containing ORF 5, a non-toxic bacteriophage 77
ORF (which is set at 100%). Each bar represents the average obtained from three
Staphylococcus aureus transformants grown in duplicate. Bacteriophage 77 ORFs showing
significant growth inhibition are ORFs 17, 19, 102, 104, and 182.
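
The normalization used for FIGURE 4 can be sketched as follows. This is a minimal illustration that assumes growth is recorded as a culture density reading for each replicate; the function name relative_growth and all numeric values are hypothetical and are not data from the screen.

```python
from statistics import mean


def relative_growth(test_readings, control_readings) -> float:
    """Mean growth of test transformants as a percentage of the control mean.

    Each list holds one reading per culture, e.g. three transformants grown
    in duplicate gives six readings. The control (the ORF 5 transformant in
    the screen described above) is defined as 100%.
    """
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * mean(test_readings) / control_mean


# Hypothetical readings: three transformants x two cultures each.
control = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
candidate = [0.2, 0.2, 0.3, 0.3, 0.1, 0.1]
percent_of_control = relative_growth(candidate, control)
```

The same calculation would be run separately for the cultures grown with and without arsenite, so that growth inhibition dependent on induction of the ORF can be distinguished from effects of the vector alone.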

FIGURE 5 shows a block diagram of major components of a general purpose 
computer. 

FIGURE 6 shows an ORF map for Streptococcus pneumoniae bacteriophage 
Dp-1 showing the ORF identifiers, genomic locations, and orientations of the 85 
identified ORFs that were found to have ribosomal binding sites and thus are expected 
to be expressed. 

FIGURE 7 shows a schematic representation of the arsenite-inducible 
expression system present in a shuttle vector designed to express individual 
Streptococcus bacteriophage Dp-1 ORFs in Streptococcus. Various modifications can 
be readily made to such a vector, or other vectors can be readily constructed to 
provide inducible expression of ORFs in a particular host bacterium using well-known 
techniques. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The invention may be more clearly understood from the following description. 
The tables will first be briefly described. 

Table 1 is a listing of a large number of available bacteriophage that can be 
readily obtained and used in the present invention. 

Table 2 shows the complete nucleotide sequence of the genome of 
Staphylococcus aureus bacteriophage 77. 

Table 3 shows a list of all the ORFs from Bacteriophage 77 that were screened
in the functional assay to identify those with anti-microbial activity.

Table 4 shows the predicted nucleotide sequence, predicted amino acid
sequence, and physicochemical parameters of ORFs 17, 19, 43, 102, 104, and 182. These
include the primary amino acid sequence of the predicted protein, the average
molecular weight, amino acid composition, theoretical pI, hydrophobicity map, and
predicted secondary structure map.
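
Two of the listed parameters, amino acid composition and a hydrophobicity map, can be computed as in the following minimal sketch. It assumes the Kyte-Doolittle hydropathy scale and a sliding window of nine residues; the scale, the window size, the function names, and the example peptide are assumptions made for illustration, since the method used to prepare Table 4 is not stated here.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(protein: str) -> dict:
    """Percentage of each residue type present in the protein."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Mean hydropathy over each window of consecutive residues.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Made-up peptide used purely for illustration.
example_peptide = "MKKLLIIVVAGGSDE"
example_composition = composition(example_peptide)
example_profile = hydropathy_profile(example_peptide)
```

Plotting the profile against residue position gives a hydrophobicity map in which sustained positive stretches suggest membrane-associated or buried segments.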

Table 5 shows homology search results. BLAST analysis was performed with
ORFs 17, 19, 43, 102, 104, and 182 against the NCBI non-redundant nucleotide and
Swissprot databases. The results of this search indicate that: I) ORF 17 has no
significant homology to any gene in the NCBI non-redundant nucleotide
database, II) ORF 19 has significant homology to one gene in the NCBI
non-redundant nucleotide database - the gene encoding ORF 59 of bacteriophage phi PVL,
III) ORF 43 has significant homology to one gene in the NCBI non-redundant
nucleotide database - the gene encoding ORF 39 of phi PVL, IV) ORF 102 has
significant homology to one gene in the NCBI non-redundant nucleotide database -
the gene encoding ORF 38 of phi PVL, V) ORF 104 has no significant homology to
any gene in the NCBI non-redundant nucleotide database, VI) ORF 182 has
significant homology to one gene in the NCBI non-redundant nucleotide database -
the gene encoding ORF 39 of phi PVL.

Table 6 is a table from Alberts et al., MOLECULAR BIOLOGY OF THE
CELL 3rd ed., showing the redundancy of the "universal" genetic code.

Table 7 shows the complete nucleotide sequence of Staphylococcus aureus 
bacteriophage 3A. 

Table 8 is a listing of the ORFs identified in Staphylococcus aureus
bacteriophage 3A. 

Table 9 shows the complete nucleotide sequence of Staphylococcus aureus 
bacteriophage 96. 

Table 10 is a listing of the ORFs identified in Staphylococcus aureus
bacteriophage 96. 

Table 11 is a listing of sequences deposited in the NCBI public database
(GenBank) for bacteriophage listed in Table 1.

Table 12 is a listing of phage which encode a known lysis function, including
the identified lysis gene. 

Table 13 is a listing of bacteriophage which encode holin genes, where holin 
genes encode proteins which form pores and eventually enable other enzymes to kill 
the host bacterium. 

Table 14 is a listing of bacteriophage which encode kil genes. 

Table 15 is a list of Staphylococcus aureus sequences identified by accession
number which may include sequences from genes coding for target sequences for the 
phage 77-encoded antimicrobial proteins or peptides. The sequences were obtained 
by searching GenBank for listings. 

Table 16 shows the nucleotide sequence of the genome of Staphylococcus
aureus phage 44 AHJD.

Table 17 lists and shows the sequence position of the 73 ORFs predicted to be
encoded by Staphylococcus aureus bacteriophage 44 AHJD that are greater than 33
amino acids.
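
The selection of ORFs greater than 33 amino acids can be illustrated with a minimal six-frame scan. This sketch assumes that an ORF runs from the first ATG after a stop codon to the next in-frame stop codon and that only ATG is used as a start; the ORF definition actually applied to the phage genomes may differ, and the function names and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(genome: str, longer_than_aa: int = 33) -> list:
    """Scan all six reading frames for ATG-to-stop ORFs above a length cutoff.

    Returns (strand, start, end, length_in_amino_acids) tuples, where start
    and end are 1-based inclusive coordinates on the forward strand and the
    stop codon is included in the coordinates but not in the length.
    """
    genome = genome.upper()
    n = len(genome)
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = seq[i:i + 3]
                if start is None:
                    if codon == "ATG":
                        start = i
                elif codon in STOP_CODONS:
                    aa_len = (i - start) // 3
                    if aa_len > longer_than_aa:
                        end = i + 3  # exclusive end, stop codon included
                        if strand == "+":
                            orfs.append((strand, start + 1, end, aa_len))
                        else:
                            orfs.append((strand, n - end + 1, n - start, aa_len))
                    start = None
    return sorted(orfs, key=lambda orf: (orf[1], orf[2]))


# Made-up test sequence: one 41-codon ORF flanked by two bases on each side.
test_genome = "CC" + "ATG" + "GCT" * 40 + "TAA" + "GG"
forward_hits = find_orfs(test_genome)
reverse_hits = find_orfs(reverse_complement(test_genome))
```

Reporting coordinates on the forward strand for both orientations makes it straightforward to tabulate the sequence position of each ORF, as in the listings described here.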

Table 18 shows the ORF sequences and putative amino acid sequences for the 
Staphylococcus aureus bacteriophage 44 AHJD ORFs greater than 33 amino acids. 

Table 19 shows the similarities in sequence identified between predicted 
Staphylococcus aureus bacteriophage 44 AHJD ORFs and sequences present in public 
databases. 

Table 20 shows the homology alignments between predicted Staphylococcus 
aureus bacteriophage 44AHJD ORFs and the corresponding protein sequences present 
in public sequence databases. 

Table 21 shows the complete nucleotide sequence of the genome of 
Enterococcus bacteriophage 182.

Table 22 lists and shows the sequence position of the 80 ORFs identified in 
bacteriophage 182 and that are greater than 33 amino acids.

Table 23 shows the nucleotide and predicted amino acid sequence of all 80
ORFs identified in bacteriophage 182. 

Table 24 shows the similarities identified to date in sequence between 
Enterococcus phage 182 ORFs greater than 33 amino acids and sequences present in 
public sequence databases. 

Table 25 shows the predicted amino acid sequence as well as the predicted 
secondary structures map for two Enterococcus bacteriophage 182 ORFs.

Table 26 shows the homology alignments between predicted Enterococcus 
bacteriophage 182 ORFs and the corresponding protein sequences present in public
sequence databases. 

Table 27 lists Enterococcus sequences listed in GenBank providing possible
Enterococcal target sequences for inhibitory Enterococcus bacteriophage 182 ORFs
and other compounds with antibacterial activity.

Table 28 shows the complete nucleotide sequence of the genome of 
Streptococcus bacteriophage Dp-1.

Table 29 lists and shows sequence position of the 273 ORFs identified in 
Pneumococcal bacteriophage Dp-1 that are greater than 33 amino acids, 85 of which 
are predicted to be expressed in Dp-1 as having a ribosomal binding site. That set of 
85 ORFs is shown in the attached drawings. 

Table 30 shows the nucleotide and predicted amino acid sequence of all 273 
ORFs identified in bacteriophage Dp-1 that are identified as being expressed. 

Table 31 shows the similarities identified in sequence between Streptococcus
phage Dp-1 ORFs greater than 33 amino acids and sequences present in public 
sequence databases. 

Table 32 shows the 4731 bp sequence of Dp-1 published by Sheehan et al. (1997).

Table 33 lists Streptococcus pneumoniae sequences listed in GenBank 
providing possible target sequences for inhibitory Streptococcus pneumoniae 
bacteriophage Dp-1 ORFs and other compounds with antibacterial activity.

Background

As indicated above, the present invention is concerned, in part, with the use of 
bacteriophage coding sequences and the encoded polypeptides or RNA transcripts to
identify bacterial targets for potential new antibacterial agents. Thus, the invention
concerns the selection of relevant bacteria. Particularly relevant bacteria are those
which are pathogens of a complex organism such as an animal, e.g., mammals,
reptiles, and birds, and plants. Examples include Staphylococcus aureus, Enterococcus
species, and Streptococcus pneumoniae. However, the invention can be applied to 
any bacterium (whether pathogenic or not) for which bacteriophage are available or 
which are found to have cellular components closely homologous to components 
targeted by phage of another bacterium. 

Thus, the invention also concerns the bacteriophage which can infect a 
selected bacterium. Identification of ORFs or products from the phage which inhibit 
the host bacterium both provides an inhibitor compound and allows identification of 
the bacterial target affected by the phage-encoded inhibitor. Such targets are thus 
identified as potential targets for development of other antibacterial agents or 
inhibitors and the use of those targets to inhibit those bacteria. As indicated above, 
even if such a target is not initially identified in a particular bacterium, such a target 
can still be identified if a homologous target is identified in another bacterium. 
Usually, but not necessarily, such another bacterium would be a genetically closely 
related bacterium. Indeed, in some cases, a phage-encoded inhibitor can also inhibit 
such a homologous bacterial cellular component. 

The demonstration that bacteriophage have adapted to inhibiting a host 
bacterium by acting on a particular cellular component or target provides a strong 
indication that that component is an appropriate target for developing and using 
antibacterial agents, e.g., in therapeutic treatments. Thus, the present invention 
provides additional guidance over mere identification of bacterial essential genes, as 
the present invention also provides an indication of accessibility of the target to an
inhibitor, and an indication that the target is sufficiently stable over time (e.g., not
subject to high rates of mutation) as phage acting on that target were able to develop
and persist. Thus, the present invention identifies a subset of essential cellular
components which are particularly likely to be appropriate targets for development of
antibacterial agents. 

The invention also, therefore, concerns the development or identification of 
inhibitors of bacteria, in addition to the phage-encoded inhibitory proteins (or RNA 
transcripts), which are active on the targets of bacteriophage-encoded inhibitors. As 
described herein, such inhibitors can be of a variety of different types, but are 
preferably small molecules. 

The following description provides preferred methods for use in the various 
aspects of the invention. However, as those skilled in the art will readily recognize,
other approaches can be used to obtain and process relevant information. Thus, the
invention is not limited to the specifically described methods. In addition, the
following description provides a set of steps in a particular order. That series of steps
describes the overall development involved in the present invention. However, it is
clear that individual steps or portions of steps may be usefully practiced separately, 
and, further, that certain steps may be performed in a different order or even bypassed 
if appropriate information is already available or is provided by other sources or 
methods. 

Selecting and Growing Phage, and Isolating DNA

Conceptually, the first step involves selecting bacterial hosts of interest.
Preferably, but not necessarily, such hosts will be pathogens of clinical importance. 
Alternatively, because bacteria all share certain fundamental metabolic and structural 
features, these features can be targeted for study in one strain, for example a 
nonpathogenic one, and extrapolated to similarly succeed in pathogenic ones. 
Nonpathogenic strains may also exhibit initial advantages in being not only less 
dangerous, but also, for example, in having better growth and culturing characteristics 
and/or better developed molecular biology techniques and reagents. Consequently, 
advantageously the invention provides the ability to target virtually any bacteria, but 
preferably pathogenic bacteria, with antimicrobial compounds designed and/or 
developed using bacteriophage inhibitory proteins and peptides from phage with non- 
pathogenic and/or pathogenic hosts. 

We have selected Staphylococcus aureus, Streptococcus pneumoniae, various 
Enterococci, and Pseudomonas aeruginosa as initial exemplary pathogens. These 
bacteria are a major cause of morbidity and mortality in hospital-based infections, and 
the appearance of antibiotic resistance in all of these organisms makes it increasingly 
difficult to treat benign infections involving these organisms. Such infections can 
include, for example, otitis media, sinusitis, and skin and airway infections (Neu, 
H.C. (1992). Science 257, 1064-1073). However, the approach described below is 
clearly applicable to any human bacterial pathogens including but not restricted to 
Mycobacterium tuberculosis, Neisseria gonorrhoeae, Haemophilus influenzae, 
Acinetobacter, Escherichia coli, Shigella dysenteriae, Streptococcus pyogenes, 
Helicobacter pylori, and Mycoplasma species. This invention can also be applied to 
the discovery of anti-bacterial compounds directed against pathogens of animals other 
than humans, for example, sheep, cattle, swine, dogs, cats, birds, and reptiles. 
Similarly, the invention is not limited to animals, but also applies to plants and plant 
pathogens. 

In general, the bacteria are grown according to standard methodologies 
employed in the art, including solid, semi-solid or liquid culturing, which procedures 
can be found in or extrapolated from standard sources such as Maloy, S.R., Stewart, 
V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold Spring 
Harbor Laboratory Press, or Maniatis, T. et al. (1989) Molecular Cloning: A 
Laboratory Manual, Cold Spring Harbor University Press, Cold Spring, N.Y.; or 
Ausubel, F.M. et al. (1994) Current Protocols in Molecular Biology. John Wiley & 
Sons, Secaucus, N.J. Culture conditions are selected which are adapted to the 
particular bacterium generally using culture conditions known in the art as 
appropriate, or adaptations of those conditions. 

Nucleic acids within these bacteria can be routinely extracted through common 
procedures such as described in the above-referenced manuals and as generally known 
to those skilled in the art. Those nucleic acid stocks can then be used to practice the 
other inventive aspects described below. 

Selection and Growth of Bacteriophage, and Isolation of DNA 

The second step involves assembling a group of bacteriophages (phage 
collection) for one or more of the targeted bacterial hosts. While the invention can be 
utilized with a single bacteriophage for a pathogen or other bacterium, it is preferable 
to utilize a plurality of phage for each bacterium, as comparisons between a plurality 
of such phage provides useful additional information. Non-limiting examples of 
phage and sources for some of the above-mentioned pathogenic bacteria are found in 
Table 1. The criteria used to select such phages are that they are infectious for the 
microbe targeted, and replicate in, lyse, or otherwise inhibit growth of the bacterium 
in a measurable fashion. These phages can be very different from one another 
(representing different families), as judged by criteria such as morphology (head, tail, 
plate, etc.), and similarity of genome nucleotide sequence (cross-hybridization). Since 
such diverse bacteriophages are expected to block bacterial host metabolism and 
ultimately inhibit by a variety of mechanisms, their combined study will lead to the 
identification of different mechanisms by which the phages independently inhibit 
bacterial targets. Examples include degradation of host DNA (Parson, K.A., and 
Snustad, D.P. (1975). J. Virol. 15, 221-444) and inhibition of host RNA transcription 
(Severinova, E., Severinov, K. and Darst, S.A. (1998). J. Mol. Biol. 279, 9-18). This, 
in turn, yields novel information on phage proteins that can inhibit the targeted 
microbe. As explained below, this 1) forms the basis of novel drug discovery efforts 
based on knowledge of the primary amino acid sequence of the phage inhibitor 
protein (e.g., peptide fragments or peptidomimetics) and/or 2) leads to the 
identification of bacterial biochemical pathways, the proteins of which are essential or 
significant for survival of the targeted microbe, and which enzymatic steps or 
chemical reactions can be targeted by classical drug discovery methods using 
molecular inhibitors, for example, small molecule inhibitors. 

Bacteriophage are generally either of two types, lytic or filamentous, meaning 
they either outright destroy their host and seek out new hosts after replication, or else 
continuously propagate and extrude progeny phage from the same host without 
destroying it. Regardless of the phage life cycle and type, preferred embodiments 
incorporate phage which impede cell growth in measurable fashion and preferably 
stop cell growth. To this end, lytic phage are preferred, although certain nonlytic 
species may also suffice, e.g., if sufficiently bacteriostatic. 

Various procedures that are commonly understood by those of skill in the art 
can be routinely employed to grow, isolate, and purify phage. Such procedures are 
exemplified by those found in common laboratory aids such as Maloy, S.R., 
Stewart, V.J., and Taylor, R.K. Genetic Analysis of Pathogenic Bacteria (1996) Cold 
Spring Harbor Laboratory Press; Maniatis, T. et al. (1989) Molecular Cloning: A 
Laboratory Manual. Cold Spring Harbor University Press, Cold Spring, N.Y.; and 
Ausubel, F.M. et al. (eds.) (1994) Current Protocols in Molecular Biology. John 
Wiley & Sons, Secaucus, N.J. The techniques generally involve the culturing of 
infected bacterial cells that are lysed naturally and/or with chemical assistance, for 
example, by the use of an organic solvent such as chloroform that destroys the host 
cells thereby liberating the phage within. Following this, the cellular debris is 
centrifuged away from the supernatant containing the phage particles, and the phage 
are then selectively precipitated out of the supernatant using various 
methods, usually employing alcohols and/or other chemical compounds 
such as polyethylene glycol (PEG). The resulting phage can be further purified using 
various density gradient/centrifugation methodologies. The resulting phage are then 
chemically lysed, thereby releasing their nucleic acids that can be conveniently 
precipitated out of the supernatant to yield a viral nucleic acid supply of the phage of 
interest. 

Exemplary bacteriophage are indicated in Table 1, along with sources where 
those phage may be obtained. 

Exemplary bacteria include the reference bacteria for the identified 
bacteriophage, available from the same sources. 

Characterizing Bacteriophage Genomes for ORFs 

The third step involves systematically characterizing the genetic information 
contained in the phage genome. Within this genetic information is the sequence of all 
RNAs and proteins encoded by the phage, including those that are essential or 
instrumental in inhibiting their host. This characterization is preferably done in a 
systematic fashion. For example, this can be done by first isolating high molecular 
weight genomic DNA from the phage using standard bacterial lysis methods, followed 
by phage purification using density gradient ultracentrifugation, and extraction of 
nucleic acid from the purified phage preparation. The high molecular weight DNA is 
then analyzed to determine its size and to evaluate a proper strategy for its sequencing. 
The DNA is broken down into smaller size fragments by sonication or partial 
digestion with frequently cutting restriction enzymes such as Sau3A to yield 
predominantly 1 to 2 kilobase length DNA, which DNA can then be resolved by gel 
electrophoresis followed by extraction from the gel. 
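
By way of non-limiting illustration, the reason for using partial rather than 
complete digestion can be seen in a small in-silico digest. The sketch below 
assumes the recognition sequence GATC for Sau3A, a linear DNA molecule, and a 
cut immediately 5' of each site; the function names and the toy sequence are 
hypothetical and are not part of the described method. 

```python
def cut_positions(seq, site="GATC"):
    """Return 0-based indices where the recognition site begins."""
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions


def complete_digest(seq, site="GATC"):
    """Fragment lengths after cutting a linear molecule at every site."""
    # A site at index 0 would give an empty fragment, so it is skipped.
    cuts = [0] + [p for p in cut_positions(seq, site) if p > 0] + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:])]


def expected_site_spacing(site="GATC"):
    """Average distance between sites in random sequence with equal base frequencies."""
    return 4 ** len(site)


toy = "AAGATCTTTTGATCCC"           # hypothetical sequence
print(complete_digest(toy))         # [2, 8, 6]
print(expected_site_spacing())      # 256
```

Under these assumptions a four-base cutter would cut on average every 256 base 
pairs, so a complete digest would yield fragments well below the 1 to 2 kilobase 
range; limiting the digestion leaves most sites uncut. 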

The ends of the fragments are enzymatically treated to render them suitable for 
cloning and the pools of fragments are cloned in a bacterial plasmid to generate a 
library of the phage genome. Several hundred of these random DNA fragments 
contained in the plasmid vector are isolated as clones after introduction into an 
appropriate bacterium, usually Escherichia coli. They are then individually expanded 
in culture and the DNA from each individual clone is purified. The nucleotide 
sequences of the inserts of these clones are determined by standard automated or 
manual methods, using oligonucleotide primers located on either side of the cloning 
site to direct polymerase mediated sequencing (e.g., the Sanger sequencing method or 
a modification of that method). Other sequencing methods can also be used. 

The sequence of individual clones is then deposited in a computer, and 
specific software programs (for example, Sequencher™, Gene Codes Corp.) are used 
to look for overlap between the various sequences, resulting in ordering of contig 
sequences and ultimately providing the complete sequence of the entire bacteriophage 
genome (one such example is given in Table 2 for Staphylococcus aureus 
bacteriophage 77; others are also provided herein). This complete nucleotide 
sequence is preferably determined with a redundancy of at least 3- to 5-fold (number 
of independent sequencing events covering the same region) in order to minimize 
sequencing errors. 
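
As a rough, non-limiting illustration of how the 3- to 5-fold redundancy 
relates to the number of clones sequenced, the following sketch computes the 
number of reads needed for a given average redundancy. The genome size (40,000 
base pairs) and usable read length (500 bases) are assumed values chosen only 
for the example, and the coverage estimate assumes reads placed at random 
(a Poisson approximation). 

```python
import math


def reads_needed(genome_bp, read_bp, fold):
    """Number of reads giving the requested average redundancy."""
    return math.ceil(fold * genome_bp / read_bp)


def expected_fraction_covered(fold):
    """Poisson estimate of the fraction of bases covered by at least one read."""
    return 1.0 - math.exp(-fold)


GENOME_BP = 40_000   # assumed phage genome size
READ_BP = 500        # assumed usable read length

for fold in (3, 5):
    n = reads_needed(GENOME_BP, READ_BP, fold)
    print(fold, n, round(expected_fraction_covered(fold), 3))
```

With these assumed values, 3-fold and 5-fold redundancy correspond to 240 and 
400 reads respectively, which is consistent with the "several hundred" clones 
mentioned above. 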

Preferably, the bacterial strain used as a phage host should not possess any 
other innate plasmids, transposons, or other phage or incompatible sequences that 
would complicate or otherwise make the various manipulations and analyses more 
difficult. 

Commercially available computer software programs are used to translate the 
nucleotide sequence of the phage to identify all protein sequences encoded by the 
phage (hereafter called open reading frames or ORFs). (Customized software can 
clearly also be used.) As phages are known to transcribe their genome into RNA from 
both strands, in both directions, and sometimes in more than one frame for the same 
sequence, this exercise is done for both strands and in all six possible reading frames. 
As evolutionary constraints have forced the phage to conserve all of its vital protein 
sequences in as small a genome as possible, it is straightforward to identify all the 
proteins encoded by the phage by simple examination of the 6 translation frames of 
the genome. Once these ORFs are identified, they are cataloged into a phage 
proteome database (Table 3 lists ORFs identified from phage 77; ORF lists are also 
provided for other exemplary phage). This analysis is preferably performed for each 
phage under study. The process of ORF identification can be varied depending on the 
desired results. For example, the minimum length for the putative encoded 
polypeptide can be varied, and/or putative coding regions that have an associated 
Shine-Dalgarno sequence can be selected. In the case of phage 77 ORFs, such 
parameter adjustment was performed and resulted in the identification of ORFs as 
listed herein. Different parameters had resulted in the identification of the ORFs 
listed in the preceding U.S. Provisional Application 60/110,992, filed December 3, 
1998, which is hereby incorporated by reference in its entirety. 
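
The six-frame ORF identification described above can be sketched as follows. 
This is a minimal illustration, not the commercial software referred to in the 
text: it assumes the standard genetic code, treats an ORF as the stretch from 
the first start codon after a stop to the next in-frame stop, and exposes the 
minimum length and the set of accepted start codons as adjustable parameters. 
The function names, the default minimum length, and the toy sequence are 
hypothetical. 

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOPS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}


def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate frame 0 of dna; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def find_orfs(genome, min_aa=33, starts=("ATG",)):
    """Scan both strands, three frames each, for start...stop ORFs.

    Returns (strand, lo, hi, aa_len) tuples. lo and hi are 1-based, inclusive
    forward-strand coordinates that include the stop codon.
    """
    genome = genome.upper()
    n = len(genome)
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon in starts:
                    start = i
                elif start is not None and codon in STOPS:
                    aa_len = (i - start) // 3
                    if aa_len >= min_aa:
                        lo, hi = start + 1, i + 3
                        if strand == "-":
                            # Map reverse-strand indices back to forward coordinates.
                            lo, hi = n - i - 2, n - start
                        orfs.append((strand, lo, hi, aa_len))
                    start = None
    return orfs


toy = "CCATGAAAAAAAAATAAGG"   # hypothetical sequence
print(find_orfs(toy, min_aa=3))   # [('+', 3, 17, 4)]
```

Passing, for example, starts=("ATG", "GTG", "TTG") or a different min_aa 
changes which ORFs are reported, which is one way the parameter adjustment 
mentioned above could give differing ORF lists for the same genome. 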

Exemplary phage 77 ORFs identified in that provisional application and as 
identified herein are shown in the following table: 



ORF ID from    Genomic        a.a.    Start    ORF ID from    Genomic        a.a.    Start 
60/110,992     position       size    codon    241/190        position       size    codon 

77ORF016       23269-24024    251     TTG      77ORF017       23269-23982    237     ATG 
77ORF019       39845-40501    218     ATA      77ORF019       39851-40501    216     ATG 
77ORF050       29268-29564    98      ATG      77ORF182       29268-29564    98      ATG 
77ORF050       29268-29564    98      ATG      77ORF043       29304-29564    86      ATG 
77ORF067       34312-34551    79      CTG      77ORF104       34393-34551    52      ATG 
77ORF146       29051-29212    53      ATG      77ORF102       29051-29212    53      ATG 
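
As a consistency check on the table above, the listed a.a. size of each ORF can 
be recomputed from its genomic position. The sketch below assumes that the 
positions are 1-based and inclusive and that each interval includes the stop 
codon; the text does not state this, but the listed sizes agree with it. The 
rows are taken from the right-hand columns of the table, and the function name 
is hypothetical. 

```python
# (ORF ID, first position, last position, listed a.a. size)
ROWS = [
    ("77ORF017", 23269, 23982, 237),
    ("77ORF019", 39851, 40501, 216),
    ("77ORF182", 29268, 29564, 98),
    ("77ORF043", 29304, 29564, 86),
    ("77ORF104", 34393, 34551, 52),
    ("77ORF102", 29051, 29212, 53),
]


def aa_size(first, last):
    """Residue count for an interval that includes one terminal stop codon."""
    span = last - first + 1
    if span % 3:
        raise ValueError("interval length is not a multiple of three")
    return span // 3 - 1


for orf_id, first, last, listed in ROWS:
    print(orf_id, aa_size(first, last), listed)
```

For example, positions 23269-23982 span 714 nucleotides, or 238 codons, giving 
237 residues once the stop codon is excluded. 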



Identifying and Characterizing Inhibitory Phage ORFs 

The fourth step entails identifying the phage protein or proteins or RNA 
transcripts that have the ability to inhibit their bacterial hosts. This can be 
accomplished, for example, by either or both of two non-mutually exclusive methods. 
The first method makes use of bioinformatics. Over the past few years, a large amount 
of nucleotide sequence information and corresponding translated products have 
become available through large genome sequencing projects for a variety of 
organisms including mammals, insects, plants, unicellular eukaryotes (yeast and 
fungi), as well as several bacterial genomes such as E. coli, Mycobacterium 
tuberculosis, Bacillus subtilis, Staphylococcus aureus and many others. Such 
sequences have been deposited in public databases (for example, non-redundant 
sequence database at GenBank and SwissProt protein sequence database 
(http://www.ncbi.nlm.nih.gov)) and can be freely accessed to compare any specific 
query sequence to those present in such databases. For example, GenBank contains 
over 1.6 billion nucleotides corresponding to 2.3 million sequence records. Several 
computer programs and servers (e.g., TBLASTN) have been created to allow the rapid 
identification of homology between any given sequence from one organism to that of 
another present in such databases, and such programs are public and available free of 
charge. 
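
Programs such as TBLASTN rely on fast heuristics and amino acid substitution 
matrices. The sketch below is not that algorithm; it is a minimal 
Smith-Waterman local alignment score with a flat match/mismatch/gap scheme, 
included only to illustrate how a query sequence can be ranked against 
database entries by local similarity. The scoring values, function names, and 
the two database entries are hypothetical. 

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between strings a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def rank_hits(query, database):
    """Return (score, name) pairs, best first."""
    scored = [(local_alignment_score(query, seq), name)
              for name, seq in database.items()]
    return sorted(scored, reverse=True)


database = {               # hypothetical entries
    "entry_A": "MKTAYIAKQR",
    "entry_B": "GGGGGGGG",
}
print(rank_hits("TAYIAK", database))
```

In practice a substitution matrix and a statistical significance estimate would 
replace the flat scores used here. 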

In addition, it has been well established that basic biochemical pathways can 
be conserved in very distant organisms (for example bacteria and man), and that the 
proteins performing the various enzymatic steps in these pathways are themselves 
conserved at the amino acid sequence level. Thus, proteins performing similar 
functions (e.g., DNA repair, RNA transcription, RNA translation) have frequently 
preserved key structural signatures, identifiable by similarities across regions of 
proteins (domains and motifs). The antimicrobials of the present invention will 
preferably target features and targets that are highly characteristic or conserved in 
microbes, and not higher organisms. 

Most genomes encode individual proteins or groups of proteins that can be 
assembled into protein families that have been evolutionarily conserved. Therefore, 
similarity between a new query sequence and that of a member of a protein family 
(reference sequences from public databases) can immediately suggest a biochemical 
function for the novel query sequence, which in our case is a phage ORF. 

The sequence homology between evolutionarily distant 
members of a protein family is usually not randomly distributed along the entire 
length of the sequence but is often clustered into "motifs" and "domains". These 
correspond to three-dimensional folds that form the catalytic and/or regulatory 
structures that perform key biochemical function(s) for the group of proteins. 
Commercially available computer software programs can identify such motifs in a 
new query sequence, again providing functional information for the query sequence. 
Such structural and functional motifs have also been derived from the combined 
analysis of primary sequence databases (protein sequences) and protein structure 
databases (X-ray crystallography, nuclear magnetic resonance) using so-called 
"threading" methods (Rost, B., and Sander, C. (1996) Ann. Rev. Biophys. Biomol. 
Struct. 25, 113-136). 

Such motifs and folds are themselves deposited in public databases which can 
be directly accessed (for example, SwissProt database; 3D-ALI at EMBL, Heidelberg; 
PROSITE). This basic exercise leads to a structural homology map in which each of 
the phage ORFs has been probed for such similarities, and where initial structural and 
functional hits are identified (selected examples of sequence homologies detected 
between individual ORFs from the genome of Staphylococcus aureus bacteriophage 
77 and sequences deposited in public databases are shown in Table 5 for ORFs 
17/19/43/102/104/182). 
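
One simple way to probe a translated ORF for a sequence motif of the kind held 
in PROSITE is to convert the pattern into a regular expression. The following 
sketch handles only a simplified subset of the PROSITE pattern syntax 
(single residues, x, [..], {..}, repeat counts, and the < and > anchors) and is 
offered as an illustration rather than as the software used by the applicants. 
The function names and the test protein are hypothetical; the pattern shown is 
the well-known N-glycosylation site pattern, chosen only as a familiar example. 

```python
import re

_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Convert a simplified PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # C-terminal anchor
            suffix, element = "$", element[:-1]
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unsupported pattern element: " + element)
        residue, lo, hi = m.groups()
        if residue == "x":
            core = "."                       # any residue
        elif residue.startswith("{"):
            core = "[^" + residue[1:-1] + "]"  # any residue except those listed
        else:
            core = residue
        if lo:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_motifs(protein, pattern):
    """Return (1-based position, matched text) for every match, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(protein)]


print(prosite_to_regex("N-{P}-[ST]-{P}."))             # N[^P][ST][^P]
print(find_motifs("MKNGSAANPTW", "N-{P}-[ST]-{P}."))   # [(3, 'NGSA')]
```

Short patterns such as this one match frequently by chance, so a hit of this 
kind is only a starting point for the functional tests described below. 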

This analysis can point out phage proteins with similarity to proteins from 
other phages (such as those for E. coli) playing an important role in the basic 
biochemical pathways of the phage (such as DNA replication, RNA transcription, 
tRNAs, coat protein and assembly). Selected examples of such proteins include 
integrase and capsid protein. Therefore, this analysis enables identification and 
elimination of non-essential ORFs as candidates for an inhibitor function, as well as 
the identification of (potentially) useful ones. 

In addition, this analysis can point out specific ORFs as possible inhibitor 
ORFs. For example these ORFs may encode proteins or enzymes that alter bacterial 
cell structure, metabolism or physiology, and ultimately viability. Examples of such 
proteins present in the genome of Staphylococcus aureus bacteriophage 77 include 
orf14 (deoxyuridine triphosphatase from bacteriophage T5), and orf15 (sialidase). 
(These ORF identifications are as listed in provisional application 60/110,992.) Other 
examples include ORFs 9 and 12 of S. aureus phage 44AHJD, which encode the 
putative lysis functions found in many bacteriophages - a "holin" and an "amidase". 

In addition, it is well known that bacterial and eukaryotic viruses can usurp 
pathways from their host in order to use them to their advantage in blocking host 
cellular pathways upon infection. The phage can achieve this by 1) directly producing 
an inhibitor of a key host pathway (e.g. T7 gene 0.5 and 2), 2) directly producing a 
novel activity (e.g. T4 DNA polymerase), and 3) altering concentrations of cell 
components by producing similar functions (e.g. T4 transfer RNAs). The 
identification of sequence similarity between phage ORFs and bacterial host genome 
sequences will be highly indicative of such a mechanism. (Selected examples of such 
homologies are listed in Figure 4 of the provisional application 60/110,992 and 
include orf4 (homologous to autolysin), orf20 (hypothetical protein from 
Staphylococcus aureus) and orf29 (hypothetical protein from Staphylococcus aureus).) 
These ORFs can be analyzed by a standard biochemical approach to directly test their 
inhibitor functions (e.g., as described below). 

Alternatively, a homology search may reveal that a given phage ORF is related 
to a protein present in the databases having an activity known to be inhibitory (e.g., an 
inhibitor of host RNA polymerase by E. coli bacteriophage T7). Such a finding would 
implicate the phage ORF product in a related activity. This will also suggest that a 
new antimicrobial could be derived by a mimetic approach (e.g., peptidomimetic) 
imitating this function or by a small molecule inhibitor to the bacterial target of the 
phage ORF, or any steps in the relevant host metabolic pathway, e.g., by high throughput 
screening of small molecule libraries. Selected examples of such similarity between 
ORFs of Staphylococcus aureus bacteriophage 77 and proteins with inhibitor functions 
for bacterial hosts are listed in Figure 4 of the provisional application 60/110,992. 
These include orf9 (similar to bacteriophage P1 kilA function), and orf4 (autolysin of 
Staphylococcus aureus, amidase enzymatic activity). 

A reason for the biochemical study of individual ORFs for inhibitor function is 
that their expression or overexpression will block cellular pathways of the host, 
ultimately leading to arrest and/or inhibition of host metabolism. In addition, such 
ORFs can alter host metabolism in different ways, including modification of 
pathogenicity. Therefore, individual ORFs identified above are expressed, preferably 
overexpressed, in the host and the effect of this expression or overexpression on host 
metabolism and viability is measured. This approach can be systematically applied to 
every ORF of the phage, if necessary, and does not rely on the absolute identification 
of candidate ORFs by bioinformatics. Individual ORFs are resynthesized from the 
phage genomic DNA, e.g., by the polymerase chain reaction (PCR), preferably using 
oligonucleotide primers flanking the ORF on either side. These single ORFs are 
preferably engineered so that they contain appropriate cloning sites at their extremities 
to allow their introduction into a new bacterial expression plasmid, allowing 
propagation in a standard bacterial host such as E. coli, but containing the necessary 
information for plasmid replication in the target microbe such as S. aureus (hereafter 
referred to as shuttle vector). Shuttle vectors and their use are well known in the art. 
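
The design of PCR primers carrying cloning sites at their extremities can be 
sketched as below. This is a simplified, hypothetical example: it handles only 
ORFs on the forward strand, anneals to the first and last bases of the ORF 
itself rather than to flanking sequence, and uses the BamHI (GGATCC) and 
HindIII (AAGCTT) recognition sequences purely as arbitrary example sites. The 
function names, annealing length, and toy genome are assumptions for 
illustration only. 

```python
def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_primers(genome, first, last, anneal_len=20,
                   fwd_site="GGATCC", rev_site="AAGCTT"):
    """Primers for a forward-strand ORF at 1-based, inclusive positions first..last.

    Each primer is written 5' to 3' as: cloning site + annealing region.
    """
    orf = genome[first - 1:last].upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF shorter than the annealing length")
    for site in (fwd_site, rev_site):
        # Both example sites are palindromic, so one strand check is enough.
        if site in orf:
            raise ValueError("cloning site " + site + " also occurs inside the ORF")
    forward = fwd_site + orf[:anneal_len]
    reverse = rev_site + reverse_complement(orf[-anneal_len:])
    return forward, reverse


toy_orf = "ATGAAACCCGGGTTTAAACCCGGGTAA"      # hypothetical ORF
toy_genome = "TTTT" + toy_orf + "CCCC"
print(design_primers(toy_genome, 5, 31, anneal_len=6))
```

The check for internal occurrences of the cloning sites reflects the practical 
point that a site chosen for cloning should not also cut within the ORF being 
cloned. 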

Such shuttle vectors preferably also contain regulatory sequences that allow 
inducible expression of the introduced ORF. As the candidate ORF may encode an 
inhibitor function that will eliminate the host, it is beneficial that it not be expressed 
prior to testing for activity. Thus, screening for such sequences when expressed in a 
constitutive fashion is less likely to be successful when the inhibitor is lethal. In the 
exemplary inducible system presented in Figures 1A, 1B, 2, and 7, regulatory 
sequences from the ars operon of S. aureus are used to direct individual ORF 
expression in S. aureus (or other bacteria in which the ars system is functional). The 
ars operon encodes a series of proteins which normally mediate the extrusion of 
arsenite and other trivalent oxyanions from the cells when they are exposed to such 
toxic substances in their environment. The operon encoding this detoxifying 
mechanism is normally silent and only induced when arsenite-related compounds are 
present (Tauriainen, S. et al. (1997) Appl. Env. Microbiol. Vol. 63, No. 11, p. 4456-4461). 

Therefore, individual phage ORFs can be expressed in S. aureus in an 
inducible fashion by adding to the culture medium non-toxic arsenite concentrations 
during the growth of individual S. aureus clones expressing such individual phage 
ORFs. Toxicity of the phage inhibitor ORF for the host is monitored by reduction or 
arrest of growth under induction conditions, as measured by optical density in liquid 
culture or after plating the induced cultures on solid medium. Subsequently, 
interference of the phage ORF with the host biochemical pathways ultimately leading 
to reduced or arrested host metabolism can be measured by pulse-chase experiments 
using radiolabeled precursors of either DNA replication, RNA transcription, or protein 
synthesis. Similar constructs can be made and used for other bacteria using well- 
known techniques. 
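
The optical density readout described above can be reduced to a single figure 
per clone, for example as in the following sketch. The calculation, the 50% 
threshold, the function names, and the optical density values are all 
hypothetical and serve only to show one way of comparing induced and uninduced 
cultures of the same clone. 

```python
def net_growth(od_series):
    """Increase in optical density from the first to the last reading."""
    return od_series[-1] - od_series[0]


def percent_inhibition(od_uninduced, od_induced):
    """Growth inhibition of the induced culture relative to the uninduced one."""
    control = net_growth(od_uninduced)
    if control <= 0:
        raise ValueError("uninduced culture did not grow")
    return 100.0 * (1.0 - net_growth(od_induced) / control)


def is_inhibitory(od_uninduced, od_induced, threshold=50.0):
    """Flag a clone whose induced growth is reduced by at least threshold percent."""
    return percent_inhibition(od_uninduced, od_induced) >= threshold


# Hypothetical optical density readings over time for one clone.
uninduced = [0.05, 0.20, 0.50, 0.80]
induced = [0.05, 0.10, 0.15, 0.20]
print(round(percent_inhibition(uninduced, induced), 1), is_inhibitory(uninduced, induced))
```

A clone carrying the vector without an ORF, grown with the same arsenite 
concentration, would be a natural additional comparison for separating any 
effect of the inducer itself from that of the ORF. 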

Those skilled in the art are familiar with a variety of other inducible systems 
which can also be used for the controlled expression of phage ORFs, including, for 
example, lactose (see, e.g., Stratagene's LacSwitch™ II system; La Jolla, CA) and 
tetracycline-based systems (see, e.g., Clontech's Tet-On/Tet-Off™ system; Palo Alto, 
CA). The arsenite-inducible system described is further depicted in Figures 1, 2 and 7. 

The selection or construction of shuttle vectors and the selection and use of 
inducible systems are well known and thus other shuttle vectors appropriate for other 
bacteria can be readily provided by those skilled in the art, e.g., for use in other 
bacterial species. 

Standard methodologies for expressing proteins from constructs, and isolating 
and manipulating those proteins, for example in cross-linking and affinity 
chromatography studies, may be found in various commonly available and known 
laboratory manuals. See, e.g., Current Protocols in Protein Science. John Wiley & 
Sons, Secaucus, N.J., and Maniatis, T. et al. (1989) Molecular Cloning: A Laboratory 
Manual. Cold Spring Harbor University Press, Cold Spring, N.Y. 

It has been found that certain phage or other viruses inhibit host cells, at least 
in part, by producing an antisense RNA which binds to and inhibits translation from a 
bacterial RNA sequence. Thus, in the case of potentially inhibitory RNA transcripts 
encoded by the phage genome, a strong indicator of a possible inhibitory function is 
provided by the identification of phage sequence which is identical to or fully 
complementary to a bacterial sequence (or has only a small percentage of mismatch, 
e.g., <10%, preferably less than 5%, most preferably less than 3%). This approach is 
convenient in the case of bacteria that have been essentially completely sequenced, as 
the comparison can be performed by computer using public database information. 
The inhibitory effect of the transcript can be confirmed using expression of the 
phage sequence in a host bacterium. If needed, such inhibition can also be tested by 
transfecting the cells with a vector that will transcribe the phage sequence to form 
RNA in such manner that the RNA produced will not be translated into a polypeptide. 
Inhibition under such conditions provides a strong indication that the inhibition is due 
to the transcript rather than to an encoded polypeptide. 
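
The computer comparison of phage and bacterial sequence for identity or 
complementarity can be illustrated with the following sketch. It slides the 
phage segment, and its reverse complement, along the bacterial sequence without 
gaps and reports positions where the mismatch fraction is below a threshold 
(10% by default, following the figure given above). The ungapped comparison, 
the function names, and the toy sequences are assumptions made for the example; 
a genome-scale search would use an indexed alignment tool instead. 

```python
def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def complementarity_hits(phage_seq, bacterial_seq, max_mismatch=0.10):
    """Return (1-based position, orientation, mismatch fraction) for each hit."""
    if not phage_seq:
        raise ValueError("empty phage sequence")
    target = bacterial_seq.upper()
    probes = {
        "identical": phage_seq.upper(),
        "complementary": reverse_complement(phage_seq),
    }
    hits = []
    for label, probe in probes.items():
        for i in range(len(target) - len(probe) + 1):
            window = target[i:i + len(probe)]
            mismatches = sum(a != b for a, b in zip(probe, window))
            fraction = mismatches / len(probe)
            if fraction < max_mismatch:
                hits.append((i + 1, label, fraction))
    return hits


phage = "ATGGCCATTGTAATGGGCCGC"                           # hypothetical segment
bacterium = "TTTT" + reverse_complement(phage) + "AAAA"   # hypothetical target
print(complementarity_hits(phage, bacterium))
```

Tightening max_mismatch to 0.05 or 0.03 corresponds to the preferred and most 
preferred mismatch levels stated above. 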

In an alternative, the expression of an ORF in a host bacterium is found to be 
inhibitory, but the inhibition is found to be due to an RNA product of the genomic 
coding region. For antisense inhibition, the sequence of the bacterial target nucleic 
acid sequence can be identified by inspection of the phage sequence, and the full 
sequence of the relevant coding region for the bacterial product can be found from a 
database of the bacterial genomic sequence or can be isolated by standard techniques 
(e.g., a clone in a genomic library can be isolated which contains the full bacterial 
ORF, and then sequenced). 

In either case, the identification of a target which is inhibited by an RNA 
transcript produced by a phage provides both the possible inhibition of bacteria 
naturally containing the same target nucleic acid sequence, as well as the ability to use 
the target sequence in screening for other types of compounds which will act directly 
on the target nucleic acid sequence or on a polypeptide product expressed or 
regulated, at least in part, by the target of the inhibitory phage RNA. 

In some cases it will be found that the target of an inhibitory phage RNA or 
protein has previously been found to be a target for an antibacterial agent. In such 
cases, the phage inhibitor can still provide useful information if it is found that the 
phage-encoded product acts at a different site than the previously identified 
antibacterial agent or inhibitor, i.e., acts at a phage-specific site. For many targets, 
action at a different site provides highly beneficial characteristics and/or information. 
For example, an alternate site of inhibitor action can at least partially overcome a 
resistance mechanism in a bacterium. As an illustration, in many cases, resistance is 
due, in large part, to altered binding characteristics of the immediate target to the 
antibacterial agent. The altered binding is due to a structural change which prevents 
or destabilizes the binding. However, the structural change is frequently quite local, 
so that compounds which bind at different local sites will be unaffected or affected to a 
much lesser degree. Indeed, in some cases the local sites will be on a different 
molecule and so may be completely unaffected by the local structural change creating 
resistance to the original agent(s). An example of resistance due to altered binding is 
provided by methicillin-resistant Staphylococcus aureus, in which the resistance is 
due to an altered penicillin-binding protein. 

In other cases, a new site of action can have improved accessibility as 
compared to a site acted on by a previously identified agent. This can, for example, 
assist in allowing effective treatment at lower doses, or in allowing access by a larger 
range of types of compounds, potentially allowing identification of more potential 
active agents. 

Another advantage is that the structural characteristics of a different site of 
action will lead to identification and/or development of inhibitors with different 
structures and different pharmacological parameters. This can allow a greater range of 
possibilities when selecting an antibacterial agent. 

Yet further, different sites often produce different inhibitory characteristics in 
the target organism. This is commonly the case for multi-domain target proteins. 
Thus, inhibition targeting an alternate site can produce more efficacious action, e.g., 
faster killing, slower development of resistance, lower numbers of surviving cells, and 
different secondary effects (for example, different nutrient utilization). 

Staphylococcus aureus phage 77 

As indicated above, the present invention is concerned, in part, with the use of 
bacteriophage 77 coding sequences and the encoded polypeptides or RNA transcripts 
to identify bacterial targets for potential new antibacterial agents. 

As described, phage 77 ORFs 17, 19, 43, 102, 104, and 182 have been found 
to have bacteria-inhibiting function. Identification of ORFs 17, 19, 43, 102, 104, and 
182 and products from the phage which inhibit the host bacterium both provides an 
inhibitor compound and allows identification of the bacterial target affected by the 
phage-encoded inhibitor. Such a target is thus identified as a potential target for 
development of other antibacterial agents or inhibitors and the use of those targets to 
inhibit those bacteria. As indicated above, even if such a target is not initially 
identified in a particular bacterium, such a target can still be identified if a 
homologous target is identified in another bacterium. Usually, but not necessarily, 
such another bacterium would be a genetically closely related bacterium. Indeed, in 
some cases, an inhibitor encoded by phage 77 ORF 17, 19, 43, 102, 104, or 182 can 
also inhibit such a homologous bacterial cellular component. 

Possible bacterial target sequences are described herein by reference to sequence 
source sites. In preferred embodiments, the sequence encoding the target corresponds 
to a S. aureus nucleic acid sequence available from numerous sources including S. 
aureus sequences deposited in GenBank, S. aureus sequences found in European 
Patent Application No. 97100110.7 to Human Genome Sciences, Inc. filed January 7, 
1997, S. aureus sequences available from TIGR at 
http://www.tigr.org/tdb/mdb/mdb.html, and S. aureus sequences available from the 
Oklahoma University S. aureus sequencing project at the following URL: 
http://www.genome.ou.edu/staph_new.html. Such possible targets are particularly 
applicable to S. aureus phages 77, 3A, 96, and 44AHJD. 

The amino acid sequence of a polypeptide target is readily provided by 
translating the corresponding coding region. For the sake of brevity, the sequences 
are not reproduced herein. Also, in preferred embodiments, a target sequence 
corresponds to a S. aureus coding sequence corresponding to a sequence listed in 
Table 15 herein. The listing in Table 15 describes S. aureus sequences currently listed 
with GenBank. Again, for the sake of brevity, the sequences are described by 
reference to the database accession numbers instead of being written out in full herein. 
In cases where an entry for a coding region is not complete, the complete sequence 
can be readily obtained by routine methods, e.g., by isolating a clone in a phage host 
S. aureus genomic library, and sequencing the clone insert to provide the relevant 
coding region. The boundaries of the coding region can be identified by conventional 
sequence analysis and/or by expression in a bacterium in which the endogenous copy 
of the coding region has been inactivated and using subcloning to identify the 
functional start and stop codons for the coding region. 
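
The translation step described above can be illustrated with a minimal Python sketch. It assumes the coding region is supplied as a DNA string that is already in frame and begins at its start codon, and it uses the standard codon assignments; the function name `translate` and the use of "X" for unrecognized codons are illustrative choices, not part of the described method.

```python
from itertools import product

# Standard codon assignments, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(coding_region: str) -> str:
    """Translate an in-frame DNA coding region up to the first stop codon.

    Codons containing characters other than A, C, G, T are reported as "X"
    (an assumption made for this sketch).
    """
    seq = coding_region.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[i:i + 3], "X")
        if amino_acid == "*":  # stop codon: end of the coding region
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGAAATTTTAA"))
```

In practice the input would be the coding region retrieved from the database entry; this sketch does not attempt to locate the functional start codon, which the text above assigns to sequence analysis or subcloning.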

Staphylococcus aureus phage 44AHJD

The present invention also can utilize the identification of naturally occurring
DNA sequence elements within Staphylococcus aureus bacteriophage 44AHJD which
encode proteins with antimicrobial activity.

Such identification can utilize bioinformatics identification of specific proteins
(ORFs) utilized by Staphylococcus aureus bacteriophage 44AHJD during the viral life
cycle, resulting in a slowing or arrest of growth of the bacterial host, or in death of
the Staphylococcus aureus host, including lysis of the infected bacteria. Thus, some of
the bacteriophage 44AHJD DNA sequences encoding these proteins (ORFs) are
predicted to encode antimicrobial functions. Information derived from these DNA
sequences and translated ORFs can, in turn, be utilized to develop inhibitory
compounds by peptidomimetics that can also function as antimicrobials. In addition,
the host bacterial proteins that are targeted and inhibited by the
antimicrobial bacteriophage ORFs can themselves provide novel targets for drug
discovery.

The methodology described above is used to identify and characterize DNA
sequences from Staphylococcus sp. bacteriophage 44AHJD that have antimicrobial
activity. As described in the Examples, the Staphylococcus aureus propagating strain
(PS 44A), obtained from the Felix d'Herelle Reference Centre (#HER 1101), was
used as a host to propagate its phage 44AHJD, also obtained from the Felix d'Herelle
Reference Centre (#HER 101). By sequencing, we found that bacteriophage 44AHJD
consists of 16,668 bp (Table 16) predicted to encode 73 ORFs greater than 33 amino
acids (Tables 17 & 18). Computational analysis of the predicted protein products of
Staphylococcus aureus bacteriophage 44AHJD identified homologs in public sequence
databases as listed in Tables 19 and 20, along with the accompanying list of related
proteins.
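
The prediction of ORFs greater than 33 amino acids can be sketched as a six-frame scan. The following Python example is only an illustration under stated assumptions: an ORF is taken to run from the first ATG in a frame to the next in-frame stop codon, alternative start codons are ignored unless passed in, and the function names are hypothetical. The criteria actually used to generate the ORF tables are not specified here.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(genome: str, min_codons: int = 34, starts=("ATG",)):
    """Scan all six reading frames for ORFs of at least min_codons codons.

    Returns tuples (strand, start, end, n_codons). Coordinates are 0-based,
    end-exclusive, and relative to the strand being scanned; n_codons counts
    the start codon but not the stop codon. min_codons=34 corresponds to
    "greater than 33 amino acids".
    """
    forward = genome.upper()
    orfs = []
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon in starts:
                    start = i  # first start codon since the last stop
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None
    return orfs


if __name__ == "__main__":
    toy_genome = "ATG" + "GCT" * 40 + "TAA"
    print(find_orfs(toy_genome))
```

Each ORF found this way could then be translated and compared against public databases, as described for Tables 19 and 20.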

From this analysis, it is apparent that 3 genes (ORF 3, 7, and 8) are related to 
structural proteins found in other bacteriophages. These include genes predicted to 
encode a tail protein (ORF 3), an upper collar/connector protein of the phage virion 
(ORF 7), and a lower collar protein (ORF 8). Bioinformatics has also identified one 
gene whose product is likely involved in phage DNA synthesis. One gene (ORF 1) 
shows significant homology to DNA polymerases of a number of bacteriophages, 
bacteria and fungi, and the product of this gene is likely responsible for replicating 
the genetic material of bacteriophage 44 AHJD. ORF 2 encodes a protein with 
homology to the dinC gene of Bacillus subtilis that encodes a protein involved in 
teichoic acid biosynthesis. Teichoic acid is a polyphosphate polymer found in some, 
but not all, Gram positive organisms (and not in Gram negative organisms), where it 
is attached to the peptidoglycan layer. The phage protein may thus be involved in the 
synthesis of this material for incorporation into the cell wall, allowing enhanced lysis 
by the phage lysis enzymes or, as many enzymes can function in "reverse reactions", 
may be involved in its degradation allowing for penetration of the peptidoglycan and 
phage genome entry into the cell following adsorption. The similarity between
Staphylococcus aureus bacteriophage 44AHJD and E. coli phage T7 indicates that
they may share similar mechanisms of replication and growth. Both phages belong to
the Podoviridae Family of bacteriophages and are members of the "T7-like" Genus
of this Family (Ackermann and DuBow; VIth ICTV Report).

Two genes, ORF 9 and 12, were identified with the potential to encode
antimicrobial protein products. The homology alignments are shown in Tables 19 and
20. The predicted product of ORF 9 is related to a class of genes which encodes
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide
cell wall structure of a variety of micro-organisms, including that from the
Staphylococcus aureus bacteriophage Twort. ORF 12 of Staphylococcus aureus
bacteriophage 44AHJD shows homology to a set of lysis proteins from several
bacteriophages. These lysis proteins are also referred to as holins, and represent
phage-encoded lysis functions required for transit of the phage murein hydrolases
(lysozyme) to the periplasm, where they can digest the cell wall and thus lyse the
bacterium.

Thus, in particular embodiments, the present invention provides a nucleic acid
sequence isolated from Staphylococcus aureus bacteriophage 44AHJD comprising at
least a portion of one of the genes described above with antimicrobial activity. For
example, ORF 1 encodes a DNA polymerase function. This polymerase may utilize
host-derived accessory proteins for its activity when replicating the phage template,
sequestering such proteins from use by the bacterial polymerase, resulting in
inhibition of DNA replication, cell division, and cell growth. Alternatively, ORF 9
directly encodes a polypeptide with antimicrobial activity. ORF 9 is predicted to
encode an amidase, a protein known to act as a cell wall degrading enzyme. ORF 12
likely encodes a holin function required for transit of the phage amidase (gene P
product) to the periplasm. When this type of gene product from Bacillus phage phi 29
(gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993).
Thus, production of protein from Bacillus phage phi 29 gene 14 in E. coli resulted in
cell death, whereas production of protein from Bacillus phage phi 29 gene 14
concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led
to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in
the cytoplasmic membrane (Steiner et al., 1993).

The present invention also provides the use of the Staphylococcus
bacteriophage 44AHJD antimicrobial ORFs or ORF products as pharmacological
agents, either wholly or in part and derivatives, as well as the use of corresponding
peptidomimetics, developed from amino acid or nucleotide sequence knowledge
derived from Staphylococcus bacteriophage 44AHJD killer ORFs.

Enterococcus phage 182

Bacteriophage 182 was obtained from the Felix d'Herelle phage collection
(Ste. Foy, Quebec) and infects Enterococcus sp. Group D. The genome of
Enterococcus bacteriophage 182 consists of 17,833 bp (Table 21) and is predicted to
encode 80 ORFs greater than 33 amino acids (Tables 22 and 23). Computational
analysis of the predicted protein products of Enterococcus bacteriophage 182 was
performed in order to identify protein products related to those deposited in public
databases. Bacteriophage 182 protein products which detected sequences with
significant sequence similarity in public databases are listed in Tables 24 and 26, along
with the accompanying list of related proteins.

From this analysis, it is apparent that 5 genes (ORF 001, 004, 007, 009, and
011) are related to structural proteins of several Bacillus phages: Bacillus
bacteriophage PZA, phi-29, and B103. These include genes predicted to encode a tail
protein (ORF 001), a head protein (ORF 004), an upper collar protein (ORF 007), a
lower collar protein (ORF 009), and a pre-neck appendage protein (ORF 011). Two
genes, ORF 005 and 019, are predicted to encode products which direct phage
morphogenesis.

Bioinformatics has also identified three genes whose products are likely
involved in phage DNA synthesis. One gene, ORF 002, shows significant homology to
DNA polymerases of a number of bacteriophages, and the product of this gene is
likely responsible for replicating the genetic material of bacteriophage 182. ORF 006
encodes a protein with homology to the encapsidation proteins of several other
bacteriophages, including Bacillus phage phi-29 (P11014), PZA (P07541), and B103
(X99260) and Streptococcus phage CP-1 (Z47794). These gene products catalyze the
in vivo and in vitro genome-encapsidation reaction (Garvey et al., 1985). Proteins
involved in genome packaging have been shown to have additional activities that
affect biochemical reactions in other phages and their hosts. For example, the coat
protein of the RNA bacteriophage MS2 interacts with viral RNA to translationally
repress replicase synthesis (Pickett and Peabody, 1993). This protein-RNA interaction
also plays a role in genome encapsidation, enveloping a single copy of the viral
genome in a protein shell composed of many molecules of coat protein. In addition,
the bacteriophage lambda terminase enzyme can be lethal to E. coli when expressed,
suggesting cleavage of packaging sites in the bacterial chromosome. Also present
within bacteriophage 182 is a gene, ORF 010, that encodes a protein that is related to
the terminal proteins of Bacillus phage Nf (P06812), Bacillus phage GA-1 (X96987)
and Bacillus phage B103 (X99260). DNA terminal proteins are linked to the 5' ends
of both strands of the genome and are essential for DNA replication, playing a role in
initial priming of DNA replication. The similarity between Enterococcus
bacteriophage 182 and Bacillus phages phi-29, PZA, and B103 indicates that they
may share similar mechanisms of replication and growth. Protein-primed DNA
replication is a well described phenomenon, and in the phi-29-like phages, the ends of
the DNA serve as origins and termini of replication (Gutierrez et al., 1986; Yoshikawa
et al., 1985).

There is also a gene (ORF 015) that encodes a protein showing homology to 
an early protein product of Bacillus bacteriophage PZA and the single-strand nucleic 
acid binding protein of bacteriophage B103. 

Two genes, ORF 008 and 014, were identified with the potential to encode
anti-microbial protein products. The homology alignments are shown in Tables 24 &
26 and biochemical features of the predicted polypeptides are shown in Table 25. The
predicted product of ORF 008 is related to a class of genes which encodes
lysozyme-like functions, enzymes which cleave linkages in the mucopolysaccharide cell wall
structure of a variety of micro-organisms. ORF 014 of Enterococcus bacteriophage 182 shows
homology to a set of lysis proteins from Bacillus bacteriophage phi-29, PZA, and
B103. These lysis proteins are also referred to as holins and represent phage-encoded
lysis functions required for transit of the phage murein hydrolases (lysozyme) to the
periplasm, where they can digest the outer cell wall and thus lyse the bacterium.

Thus, the present invention provides a nucleic acid sequence obtained from
Enterococcus bacteriophage 182 comprising at least a portion of a phage 182 ORF,
preferably an inhibitory ORF, and more preferably at least a portion of one of the
genes described above with anti-microbial activity. For example, ORF 002 encodes a
DNA polymerase function. This polymerase may utilize host-derived accessory
proteins for its activity when replicating the phage template, sequestering such
proteins from use by the bacterial polymerase, resulting in inhibition of DNA
replication, cell division, and cell growth. Alternatively, ORFs 008 or 014 directly
encode polypeptides with anti-microbial activity. ORF 008 is predicted to encode an
autolytic lysozyme, a protein known to have anti-microbial activity (Martin et al.,
1998). ORF 014 likely encodes a holin function required for transit of the phage
murein hydrolases to the periplasm. When the related product from Bacillus phage phi
29 (gene 14) was cloned in Escherichia coli, cell death ensued (Steiner et al., 1993).
Thus, production of protein from Bacillus phage phi 29 gene 14 in E. coli resulted in
cell death, whereas production of protein from Bacillus phage phi 29 gene 14
concomitantly with the phi 29 lysozyme or unrelated murein-degrading enzymes led
to lysis, suggesting that membrane-bound protein 14 induces a nonspecific lesion in
the cytoplasmic membrane (Steiner et al., 1993).

The present invention also provides the use of the Enterococcus bacteriophage
182 anti-microbial ORFs as pharmacological agents, either wholly or in part and
derivatives, as well as the use of corresponding peptidomimetics, developed from
amino acid or nucleotide sequence knowledge derived from Enterococcus
bacteriophage 182 killer ORFs. This can be done where the structure of the
peptidomimetic compound corresponds to the structure of the active portion of a
product of an ORF. In this analysis, the peptide backbone is transformed into a
carbon-based hydrophobic structure that can retain cytostatic or cytocidal activity for the
bacterium. This is done by standard medicinal chemistry methods, measuring growth
inhibition of the various molecules in liquid cultures or on solid medium. These
mimetics also represent lead compounds for the development of novel antibiotics. In
this context, "corresponds" means that the peptidomimetic compound structure has
sufficient similarities to the structure of the active portion of a product of one of the
Enterococcus ORFs listed, that the peptidomimetic will interact with the same
molecule as the product of the ORF, and preferably will elicit at least one cellular
response in common which relates to the inhibition of the cell by the phage protein.

To validate the identity of an ORF as a killer ORF, it is preferably expressed
in the host or other test bacterial organism and the effect of this expression on
bacterial growth and replication is assessed. Therefore, all individual ORFs identified
herein, e.g., those identified above, can be expressed, preferably overexpressed, in a
suitable host bacterium, e.g., a host Enterococcus, and the effect of this expression or
overexpression on host metabolism and viability can be measured.

Individual ORFs can be resynthesized from the phage genomic DNA by the
polymerase chain reaction (PCR) using oligonucleotide primers flanking the ORF on
either side. Those skilled in the art are familiar with the design and synthesis of
appropriate primer sequences. These single ORFs are preferably engineered so that
they contain appropriate cloning sites at their extremities to allow their introduction
into a new bacterial expression plasmid, allowing propagation in a standard bacterial
host such as E. coli, but containing the necessary information for plasmid replication
in the target microbe, Enterococcus sp. (hereafter referred to as a shuttle vector).
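
As a rough illustration of primers that carry cloning sites at their extremities, the following Python sketch builds a forward and a reverse primer for an ORF given its coordinates in the phage genome. It is a simplification under stated assumptions: the primers anneal to the first and last bases of the ORF itself rather than to flanking sequence, the BamHI (GGATCC) and EcoRI (GAATTC) sites are arbitrary example choices, the 20-base annealing length is a placeholder, and no 5' clamp bases, melting temperature checks, or internal-site checks are included. The function name `design_primers` is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(genome: str, orf_start: int, orf_end: int,
                   anneal_len: int = 20,
                   fwd_site: str = "GGATCC",   # BamHI site, example choice only
                   rev_site: str = "GAATTC"):  # EcoRI site, example choice only
    """Return (forward, reverse) primers, written 5' to 3'.

    orf_start and orf_end are 0-based, end-exclusive coordinates of the ORF
    on the forward strand of the genome.
    """
    orf = genome[orf_start:orf_end].upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    forward = fwd_site + orf[:anneal_len]
    reverse = rev_site + reverse_complement(orf[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    toy_genome = "CCC" + "ATGAAACCCGGGTTTTAG" + "CCC"
    print(design_primers(toy_genome, 3, 21, anneal_len=6))
```

A practical design would also confirm that the chosen restriction sites do not occur inside the ORF and are compatible with the shuttle vector.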

This shuttle vector also preferably contains regulatory sequences that allow
inducible expression of the introduced ORF. As the candidate ORF may encode a
killer function that will eliminate the host, it is highly advantageous that it not be
expressed (or at least not expressed at a substantial level) prior to testing for activity;
thus screening for such sequences in a constitutive fashion is less likely to be
successful (lethality). In an example presented in Fig. 7, regulatory sequences from
the ars operon are used to direct individual ORF expression in Enterococcus. The ars
operon encodes a series of proteins which normally mediate the extrusion of arsenite
and several other trivalent oxyanions from the cells when they are exposed to such
toxic substances in their environment. The operon encoding this detoxifying
mechanism is normally silent and only induced when arsenite-related compounds are
present.

Therefore, individual phage ORFs can be expressed in Enterococcus or other
suitable host in an inducible fashion by adding non-toxic arsenite concentrations to
the culture medium during the growth of individual clones of Enterococcus (or other
host cells) expressing such individual phage ORFs. Toxicity of the phage killer
ORF for the host is monitored by reduction or arrest of growth under induction
conditions, as measured by optical density in liquid culture or after plating the
induced cultures on solid medium. Subsequently, interference of the phage ORF with
the host biochemical pathways ultimately leading to reducing or arresting host
metabolism can be measured by pulse chase experiments using radiolabeled
precursors of either DNA replication, RNA transcription, or protein synthesis.
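
The optical density readout described above can be summarized numerically. The Python sketch below is a hypothetical illustration: it compares the optical density of an induced culture with that of the matching uninduced culture and flags clones whose growth is reduced by at least a chosen percentage. The 50% threshold, the function names, and the example readings are assumptions made for illustration only and are not values taken from the Examples.

```python
def percent_inhibition(od_uninduced: float, od_induced: float,
                       od_blank: float = 0.0) -> float:
    """Percent growth reduction of the induced culture relative to the
    uninduced control, after subtracting the medium blank."""
    control = od_uninduced - od_blank
    treated = od_induced - od_blank
    if control <= 0:
        raise ValueError("uninduced control shows no growth above the blank")
    return 100.0 * (1.0 - treated / control)


def flag_inhibitory(readings: dict, threshold: float = 50.0) -> list:
    """Return the names of clones whose inhibition meets the threshold.

    readings maps a clone name to (od_uninduced, od_induced).
    The threshold is an arbitrary cut-off chosen for this sketch.
    """
    return sorted(
        name
        for name, (uninduced, induced) in readings.items()
        if percent_inhibition(uninduced, induced) >= threshold
    )


if __name__ == "__main__":
    example = {"ORF_A": (1.0, 0.25), "ORF_B": (1.0, 0.9)}  # made-up readings
    print(flag_inhibitory(example))
```

Clones flagged in this way would still need confirmation by plating and by the pulse chase measurements mentioned above.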

Of course, other inducible regulatory sequences (e.g., promoters, operators,
etc.) may be used (e.g., systems using positive induction of expression or systems
using release of repression). A variety of such systems are known to those skilled in
the art and can be utilized in the present invention.

Nucleic acid sequences of the present invention can be isolated using a method
similar to those described herein or other methods known to those skilled in the art.
In addition, such nucleic acid sequences can be chemically synthesized by
well-known methods. Having the phage 182 ORFs, e.g., anti-bacterial ORFs of the present
invention, portions thereof, or oligonucleotides derived therefrom as described, other
anti-microbial sequences from other bacteriophage sources can be identified and
isolated using methods described here or other methods, including methods utilizing
nucleic acid hybridization and/or computer-based sequence alignment methods.

The invention also provides bacteriophage anti-microbial DNA segments from
other phages based on nucleic acids and sequences hybridizing to the presently
identified inhibitory ORF under high stringency conditions or sequences which are
highly homologous. The bacteriophage anti-microbial DNA segment from
bacteriophage 182 can be used to identify a related segment from another unrelated
phage based on stringent conditions of hybridization or on being a homolog based on
nucleic acid and/or amino acid sequence comparisons. As with the phage 182
inhibitory sequences, such homologous coding sequences and products can be used as
antimicrobials, to construct active portions or derivatives, to construct
peptidomimetics, and to identify bacterial targets.

Enterococcus sequences are listed in Table 27 by accession number, providing
identification of possible targets of Enterococcus phage inhibitory ORF products, e.g.,
from phage 182.

Streptococcus pneumoniae 

As indicated in the Summary above, the present invention is concerned
with the use of Streptococcus sp. bacteriophage Dp-1 coding sequences and the
encoded polypeptides or RNA transcripts to identify bacterial targets for potential new
antibacterial agents.

Streptococcus pneumoniae is an important cause of community-acquired
pneumonia and a major cause of otitis media, sinusitis, and meningitis in children and
adults. In Spain and other Mediterranean countries, the majority of S. pneumoniae are
relatively resistant to penicillin (Klugman, 1990; Fenoll et al., 1991; Jorgensen et al.,
1990). These strains also have decreased susceptibility to broad-spectrum
cephalosporins, which are frequently used in the empiric treatment of meningitis and
other serious invasive bacterial infections. High-level resistance of pneumococci has
been encountered in Hungary where 70% of children who were colonized with S.
pneumoniae carried penicillin resistant strains that were also resistant to tetracycline,
erythromycin, trimethoprim/sulfamethoxazole, and 30% resistant to chloramphenicol
(Neu, 1992). The resistance of pneumococci to macrolides such as erythromycin
averages 20-25% in France, ~20% in Japan, and <10% in Spain (Neu, 1992).

The antimicrobial susceptibilities and distribution of serotypes of the 42
isolates of S. pneumoniae in southern Taiwan from invasive infections have been
recently determined (Hseuh et al., 1996). Resistance rates among these isolates were:
erythromycin, 61.9%; clindamycin, 47.6%; chloramphenicol, 19%; and tetracycline,
73.8%. Resistance to three or more classes of antibiotics was found in 33.3% of the
isolates. Bacteremic pneumonia and primary bacteremia accounted for 64.3% of the
infections and mortality was 42.6%. Given the severity of these infections despite
adequate antibiotic therapy, there is clearly a need for introduction of new therapeutic
options to prevent mortality due to invasive S. pneumoniae infections.

Pneumococcal phages belong to four families and they present a great variety
in morphology, including lytic and temperate phages (for a review, see Garcia et al.,
1997). Examples of lytic phages are Cp-1 and Dp-1, whereas examples of temperate
phages are HB-3, EJ-1, and HB-746. The complete nucleotide sequence and
functional organization of Cp-1 has been reported (Martin et al., 1996). Cp-1 has a
19,345 bp double-stranded DNA genome, with a terminal protein covalently linked to
its 5' ends, that replicates by a protein-primed mechanism. The phage contains 29
ORFs, 23 on one strand and 6 on the opposite. When these predicted proteins were
compared to sequences compiled in GenBank/EMBL databases, 10 ORFs showed
significant similarity to proteins of bacteriophage phi 29 that infects B. subtilis (Martin et
al., 1996). The similar proteins corresponded to those involved in DNA replication
(terminal protein and DNA polymerase), structural and morphogenic proteins (major
head, collar, connector, tail, and encapsidation proteins), and proteins involved in lysis
function (holin and lysozyme). In its strategy of lysis, the holin gene product inserts
itself into the cell membrane, allowing access of the lysozyme to the peptidoglycan.
Expression of the Cp-1 holin protein in E. coli results in cell death after 2 hours of
induction, but did not lead to lysis (Garcia et al., 1997). Cells harboring a plasmid
construction with holin and lysozyme genes together did lyse after induction and the
viability loss was similar to that of the culture expressing holin alone. Cloning of
these lytic genes in S. pneumoniae showed that both genes had the same effect as in E.
coli. That is, holin itself did not lyse the culture but the viability loss was noticeable,
whereas both holin and lysozyme together were capable of lysing M31, an
amidase-deleted mutant (Garcia et al., 1997).

Recently, a small portion (~4 kbp) of a second S. pneumoniae phage, Dp-1,
has been sequenced (Sheehan et al., 1997). This portion contains the genes coding for
the lytic system (Sheehan et al., 1997) and shows a modular organization similar to
that described for Cp-1. However, in this case, a single chimeric protein appears to be
made in which the N-terminal domain is highly similar to that of the murein hydrolase
coded by a gene found in the phage BK5-T that infects Lactococcus lactis, and the
C-terminal domain is homologous to holins. Thus, both functions appear to have been
combined in a novel chimeric protein.

Bacteriophage Dp-1 was obtained from Dr. P. Garcia (Departamento de
Microbiologia Molecular, Centre de Departamento de Investigaciones Biologicas,
Consejo Superior de Investigaciones Cientificas, Velazquez, Madrid, Spain). We
found that Dp-1 has a double-stranded DNA genome of 56,506 bp, predicted to
encode 85 ORFs greater than 33 amino acids and with upstream Shine-Dalgarno
motifs for translation initiation (Tables 28 & 30, and Fig. 6). Computational analysis
was performed on the predicted protein products of Streptococcus bacteriophage
Dp-1; those products which detected homologs in public databases are listed in
Table 31, along with the accompanying list of related proteins.

From this analysis, it is apparent that several predicted genes of Dp-1 encode
polypeptides that are related to structural proteins. ORFs 001, 002, 004, and 030 are
predicted to encode tail proteins, minor structural proteins, and minor capsid proteins
(Table 31). We also note the identification of several gene products that are likely
involved in DNA synthesis. These include ORF 3, which encodes DNA polymerase;
ORF 8, which encodes a SWI/SNF helicase-related protein; ORF 10, which encodes a
protein showing homology to recA; and ORF 13, which encodes a dnaZX-like ORF.

In E. coli, rapA encodes an RNA polymerase (RNAP)-associated protein with
ATPase activity that is a homolog of the eukaryotic SWI/SNF family, a set of
proteins whose members are involved in transcription activation,
nucleosome remodeling, and DNA repair. RapA forms a stable complex with RNAP,
as if it were a subunit of RNAP, and it is possible that the ORF 8 product behaves
similarly or in a dominant-negative fashion to inhibit the activity of RapA. Mutation
of the essential E. coli dnaZX results in a block in DNA chain elongation during
replication (Muki et al., 1988). The dnaZX gene has only one open reading frame for
a 71-kDa polypeptide from which the two distinct DNA polymerase III holoenzyme
subunits, tau (71 kDa) and gamma (47 kDa), are produced. The tau subunit is the
precursor of the gamma subunit, and the gamma subunit is produced by a -1
frameshift causing early termination of translation (Tsuchihashi et al., 1990). These
proteins show single-strand DNA binding activity that is ATPase (and dATPase)
dependent and are thought to increase the processivity of the core DNA polymerase
enzyme (Lee et al., 1987).

There are several Dp-1 ORFs which encode proteins predicted to play a role in
cellular metabolic pathways. These include polypeptides involved in coenzyme PQQ
synthesis (ORFs 20, 29, 38). Pyrrolo-quinoline quinone (PQQ) is the non-covalently
bound prosthetic group of many quinoproteins catalysing reactions in the periplasm of
Gram-negative bacteria. Most of these involve the oxidation of alcohols or aldose
sugars. Interestingly, ORFs 20, 29, and 30 also show homology to the exoenzyme S
regulon (Frank, 1997). Proteins encoded by the P. aeruginosa exoenzyme S regulon
may be involved in a contact-mediated translocation mechanism to transfer anti-host
factors directly into eukaryotic cells, disrupting eukaryotic signal transduction through
ADP-ribosylation (Frank, 1997).

There is also a protein with similarity to GTP cyclohydrolase I (ORF 21) and
ORF 41 which shows homology to dUTPase (Table 31). GTP cyclohydrolase I is an
enzyme that catalyzes the first reaction in the pathway for the biosynthesis of
pteridine, a cofactor of the monooxygenases of the aromatic amino acids. Disruption
of the homologous gene in Saccharomyces cerevisiae leads to a recessive conditional
lethality due to folinic acid auxotrophy that can be complemented with the
mammalian or bacterial GTP cyclohydrolase I enzymes (Nardese et al., 1996; Mancini
et al., 1999).

ORF 16 shows high homology to autolysin. This region of the phage sequence
was previously reported (Sheehan et al., 1997) and encompasses ~4 kbp of our
sequence. The sequence published by Sheehan et al. (1997) is shown in Table 32.

Thus, the present invention provides a nucleic acid sequence obtained from
Streptococcus bacteriophage Dp-1 comprising at least a portion of a phage Dp-1 ORF,
preferably an inhibitory ORF, and more preferably at least a portion of one of the
genes described above with anti-microbial activity. For example, ORF 013 encodes a
protein with homology to the gamma subunit of DNA polymerase (dnaX gene). This
protein may act in a dominant-negative fashion to sequester the host DNA polymerase
for its own replication, thus inhibiting host DNA replication. The dnaX gene product
is essential for E. coli replication (Kodaira et al., 1983).

In certain preferred embodiments of the present invention, the bacterial target of
a bacteriophage inhibitor ORF product, e.g., an inhibitory protein or polypeptide, is
encoded by a Streptococcus nucleic acid coding sequence from a host bacterium for
bacteriophage Dp-1. As above, possible target sequences are described herein by
reference to sequence source sites. The sequence encoding the target preferably
corresponds to a Streptococcus nucleic acid sequence available from The Institute for
Genomic Research (TIGR), or available from GenBank or other public database. The
TIGR Streptococcus sequences are publicly available at The Institute for Genomic
Research at URL: http://www.tigr.org

The amino acid sequence of a polypeptide target is readily provided by
translating the corresponding coding region. For the sake of brevity, the sequences
are not reproduced herein. Also, in preferred embodiments, a target sequence
corresponds to a Streptococcus pneumoniae coding sequence corresponding to a
sequence listed in Table 33 herein. Sequences for other Streptococcal species are also
available from TIGR and/or from GenBank. The listing in Table 33 describes
Streptococcus sequences currently deposited in GenBank. Again, for the sake of
brevity, the sequences are described by reference to the GenBank entries instead of
being written out in full herein. In cases where the TIGR or GenBank entry for a
coding region is not complete, the complete sequence can be readily obtained by
routine methods, e.g., by isolating a clone in a phage Dp-1 host Streptococcus sp.
genomic library, and sequencing the clone insert to provide the relevant coding
region. The boundaries of the coding region can be identified by conventional
sequence analysis and/or by expression in a bacterium in which the endogenous copy
of the coding region has been inactivated and using subcloning to identify the
functional start and stop codons for the coding region.

In the various aspects of this invention involving Dp-1 sequences, the
sequence is preferably not contained in the sequence described in Sheehan et al., 1997
(Table 32).

Validating Identified Inhibitory Phage ORFs

35 A fifth step involves validating the identified phage inhibitor ORF by 

independent methods, and delineating further possible smaller segments of the ORFs 
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that have inhibitory activity. Several methods exist to validate the role of the 
identified ORF as an inhibitor ORF. 

One example utilizes the creation of a mutant variant of the phage ORF in
which the candidate ORF carries a partial or complete loss-of-function mutation whose
effect is measurable as compared with the non-mutant ORF. Comparison of the effects of
expression of the loss-of-function mutant with those of the normal ORF provides
confirmation of the identification of an inhibitor ORF where the loss-of-function mutant
provides a measurably lower level of inhibition, preferably no inhibition. The loss of
function may be conditional, e.g., temperature sensitive.

Once validation of the inhibitor ORF is achieved, a bi-directional deletion
analysis can be carried out using the same experimental system to identify the
minimal polypeptide segment that has inhibitor activity. This may be carried out by a
variety of means, e.g., by exonuclease or PCR methodologies, and is used to
determine if a relatively small segment of the ORF (i.e., the product of the ORF) still
possesses inhibitory activity when isolated away from its native sequence. If so, a
portion of the ORF encoding this "active portion" can be used as a template for the
synthesis of novel anti-microbial agents and for further derivation of the peptide
sequence, e.g., using modified peptides and/or peptidomimetics.
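
The boundary mapping in a bi-directional deletion analysis can be sketched as
follows. This minimal Python example assumes a single contiguous active segment
and uses hypothetical assay results; the `Construct` type, the
`minimal_active_segment` function, and the residue coordinates are illustrative
only and are not part of this disclosure.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Construct(NamedTuple):
    start: int    # first residue retained (1-based)
    end: int      # last residue retained (inclusive)
    active: bool  # True if inhibitory activity was observed in the assay


def minimal_active_segment(constructs: Sequence[Construct]) -> Optional[Tuple[int, int]]:
    """Return the region shared by every active construct, or None if undefined."""
    actives = [c for c in constructs if c.active]
    if not actives:
        return None
    # The deepest N-terminal and C-terminal deletions that keep activity set the boundaries.
    start = max(c.start for c in actives)
    end = min(c.end for c in actives)
    return (start, end) if start <= end else None


# Hypothetical results for a 200-residue ORF product.
results = [
    Construct(1, 200, True),     # full length
    Construct(21, 200, True),    # N-terminal deletions
    Construct(41, 200, True),
    Construct(61, 200, False),
    Construct(1, 180, True),     # C-terminal deletions
    Construct(1, 150, True),
    Construct(1, 120, False),
]
print(minimal_active_segment(results))
```

The returned interval is only a prediction from the two deletion series; the
corresponding fragment would still need to be expressed on its own to confirm
that it retains inhibitory activity.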

In creation of certain peptidomimetics, the peptide backbone is transformed
into a carbon-based hydrophobic structure that can retain inhibitor activity against the
bacterium. This is done by standard medicinal chemistry methods, typically
monitored by measuring growth inhibition by the various molecules in liquid cultures
or on solid medium. These mimetics can also represent lead compounds for the
development of novel antibiotics.

Recently, a major effort has been undertaken by the pharmaceutical industry
and its biotechnology partners for the sequencing of bacterial pathogen genomes.
The rationale is that the systematic sequencing of the genome will identify all of the
bacterial proteins and therefore this proteome will be the target for designing novel
inhibitor antibiotics. Although systematic, this approach has several major problems.
The first is that analysis of primary amino acid sequences of bacterial proteins does
not immediately reveal which proteins are essential for viability of the bacterium,
and target validation is thus a major issue. The second problem is one of redundancy,
as several biochemical pathways are either structurally duplicated in bacteria
(different isoforms of the same enzyme), or functionally duplicated by the presence of
salvage pathways in the event of a metabolic block in one pathway (different
nutritional conditions). The third is that even a valid target may not be structurally or
functionally amenable to inhibition by small molecules because of inaccessibility
(sequestration of target).

Therefore, there is considerable interest within the pharmaceutical and
biotechnology industry in identifying key targets for drug discovery amongst the mass
of novel targets generated by large-scale genomic sequencing projects.

On the other hand, and underscoring the instant invention, the phages herein
described have, over millions of years, evolved specific mechanisms to target such
key biochemical pathways and proteins. In the few cases where inhibition by phages
has been elucidated (e.g., see ref. 3), such bacterial targets are invariably rate-limiting
in their respective biochemical pathways, are not redundant, and/or are readily
accessible for inhibition by the phage (or by another inhibitory compound).
Therefore, the sixth step of this invention involves identifying the host biochemical
pathways and proteins that are targeted by the phage inhibitory mechanisms.

Identifying, Validating, and Characterizing Bacterial Host Target Proteins and
Affected Pathways

A rationale for this step is that the inhibitor ORF product from the phage
physically interacts with and/or modifies certain microbial host components to block
their function. Exemplary approaches which can be used to identify the host bacterial
pathways and proteins that interact with, and preferably also are inhibited by, phage
ORF product(s) are described below.

One approach is a genetic screen to determine physiological protein:protein
interaction, for example, using a yeast two-hybrid system. In this assay, the phage
ORF is fused to the carboxyl terminus of the yeast Gal4 activation domain II (amino
acids 768-881) to create a bait vector. A cDNA library of cloned S. aureus sequences,
engineered into a plasmid in which the S. aureus sequences are fused to
the DNA binding domain of Gal4, is also generated. These plasmids are introduced
alone, or in combination, into yeast strain Y190, previously engineered with
chromosomally integrated copies of the E. coli lacZ and the selectable HIS3 genes,
both under Gal4 regulation (Durfee, T., Becherer, K., Chen, P.-L., Yeh, S.-H., Yang,
Y., Kilburn, A.E., Lee, W.-H., and Elledge, S.J. (1993). Genes & Dev. 7, 555-569). If
the two proteins expressed in yeast interact, the resulting complex will activate
transcription from promoters containing Gal4 binding sites. A lacZ and a His3 gene,
each driven by a promoter containing Gal4 binding sites, have been integrated into the
genome of the host yeast system used for measuring protein-protein interactions. Such
a system provides a physiological environment in which to detect potential protein
interactions. This system has been extensively used to identify novel protein-protein
interaction partners and to map the sites required for interaction (for example, to
identify interacting partners of translation factors (Qiu, H., Garcia-Barrio, M.T., and
Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711), transcription factors
(Katagiri, T., Saito, H., Shinohara, A., Ogawa, H., Kamada, N., Nakamura, Y., and
Miki, Y. (1998). Genes, Chromosomes & Cancer 21, 217-222), and proteins involved
in signal transduction (Endo, T.A., Masuhara, M., Yokouchi, M., Suzuki, R.,
Sakamoto, H., Mitsui, K., Matsumoto, A., Tanimura, S., Ohtsubo, M., Misawa, H.,
Miyazaki, T., Leonor, N., Taniguchi, T., Fujita, T., Kanakura, Y., Komiya, S., and
Yoshimura, A. (1997). Nature 387, 921-924)). This approach has also been used in many
published reports to identify interaction between mammalian viral and mammalian
cell proteins.

For example, the non-structural protein NS1 of parvovirus is essential for viral
DNA amplification and gene expression and is also the major cytopathic effector of
these viruses. A yeast two-hybrid screen with NS1 identified a novel cellular protein
of unknown function that interacts with NS1, called SGT, for small glutamine-rich
tetratricopeptide repeat (TPR)-containing protein (Cziepluch, C., Kordes, E., Poirey, R.,
Grewenig, A., Rommelaere, J., and Jauniaux, J.C. (1998). J. Virol. 72, 4149-4156). In
another screen, the adenovirus E3 protein was recently shown to interact with a novel
tumor necrosis factor alpha-inducible protein and to modulate some of the activities of
E3 (Li, Y., Kang, J., and Horwitz, M.S. (1998). Mol & Cell Biol. 18, 1601-1610). In yet
another recent screen, the herpes simplex virus 1 alpha regulatory protein ICP0 was
found to interact with (and stabilize) the cell cycle regulator cyclin D3 (Kawaguchi, Y.,
Van Sant, C., and Roizman, B. (1997). J. Virol. 71, 7328-7336).

Another two-hybrid system for identifying protein:protein interactions is
commercially available from STRATAGENE™ as the CYTO-TRAP™ system
(Chang et al., Strategies Newsletter 11(3), 65-68 (1998) (from Stratagene)). The
system is a yeast-based method for detecting protein:protein interactions in vivo, using
activation of the Ras signal transduction cascade by localizing a signal pathway
component, human Sos (hSos), to its activation site in the yeast plasma membrane.
The system uses a temperature-sensitive Saccharomyces cerevisiae mutant, strain
cdc25H, which contains a point mutation at amino acid residue 1328 of the cdc25
gene. This gene encodes a guanyl nucleotide exchange factor which binds and
activates Ras, leading to cell growth. The mutation in the cdc25 gene prevents host
growth at 37°C, but at a permissive temperature of 25°C, growth is normal. The
system utilizes the ability of hSos to complement the cdc25 defect and activate the
yeast Ras signaling pathway. Once hSos is expressed and localized to the plasma
membrane, the cdc25H yeast strain grows at 37°C. Localizing hSos to the plasma
membrane occurs through a protein:protein interaction. A protein of interest, or bait,
is expressed as a fusion protein with hSos. The library, or target, proteins are
expressed with the myristylation membrane-localization signal. The yeast cells are
then incubated under restrictive conditions (37°C). If the bait and the target protein
interact, the hSos protein is recruited to the membrane, activating the Ras signaling
pathway and allowing the cdc25H yeast strain to grow at the restrictive temperature.

The protein targets of phage inhibitory ORFs can also be identified using
bacterial genetic screens. One approach involves the overexpression of a phage
inhibitory protein in mutagenized bacterial host species, followed by plating the cells
and searching for colonies that can survive the antimicrobial activity of the inhibitory
ORF. These colonies are then grown, and their DNA is extracted and cloned into an
expression vector that contains a replicon of a different incompatibility group from
the plasmid expressing the original ORF. This library is then introduced into a
wild-type host bacterium in conjunction with an expression vector driving synthesis of
the phage ORF, followed by selection for surviving bacteria. Thus, library plasmids
recovered from the survivors presumably contain a DNA fragment from the original
mutagenized host bacterial genome that can protect the cell from the antimicrobial
activity of the inhibitory phage ORF. This fragment can be sequenced and compared
with that of the bacterial host to determine in which gene the mutation lies. This
approach enables one to determine the targets and pathways that are affected by the
killing function.
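
The comparison of a protective fragment with the wild-type host sequence can be
sketched as below. This minimal Python example assumes the two sequences are
already aligned, of equal length, and differ only by substitutions; the function
name `find_substitutions` and the sequences are hypothetical.

```python
from typing import List, Tuple


def find_substitutions(wild_type: str, mutant: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, wild-type base, mutant base) for each mismatch."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be pre-aligned and of equal length")
    return [
        (index + 1, wt_base, mut_base)
        for index, (wt_base, mut_base) in enumerate(zip(wild_type.upper(), mutant.upper()))
        if wt_base != mut_base
    ]


# Hypothetical wild-type and mutant fragments differing at one position.
print(find_substitutions("ATGGCA", "ATGACA"))
```

Insertions and deletions are outside the scope of this sketch; real sequencing
data would generally be compared with a sequence alignment tool before the
affected gene is assigned.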

A second approach is based on identifying protein:protein interactions
between the phage ORF product and bacterial, e.g., S. aureus, proteins using a
biochemical approach based, for example, on affinity chromatography. This approach
has been used, for example, to identify interactions between lambda phage proteins
and proteins from their E. coli host (Sopta, M., Carthew, R.W., and Greenblatt, J.
(1985). J. Biol. Chem. 260, 10353-10369). The phage ORF is fused to a peptide tag
(e.g., glutathione-S-transferase ("GST"), 6xHIS ("HIS"), and/or calmodulin binding
protein ("CPB")) within a commercially available plasmid vector that directs high
level expression on induction of a suitably responsive promoter driving the fusion's
expression. The translated fusion protein is expressed in E. coli, purified, and
immobilized on a solid phase matrix via, for example, the tag. Total cell extracts from
the host bacterium, e.g., S. aureus, are then passed through the affinity matrix
containing the immobilized phage ORF fusion protein; host proteins retained on the
column are then eluted under different conditions of ionic strength, pH, detergents,
etc., and characterized by gel electrophoresis and other techniques. Appropriate
controls are run to guard against nonspecific binding to the resin. Target proteins thus
recovered should be enriched for those that bind the phage protein/peptide of interest
and are subsequently electrophoretically or otherwise separated, purified, sequenced, or
biochemically analyzed. Usually sequencing entails individual digestion of the
proteins to completion with a protease (e.g., trypsin), followed by molecular mass and
amino acid composition and sequence determination using, for example, mass
spectrometry, e.g., by MALDI-TOF technology (Qin, J., Fenyo, D., Zhao, Y., Hall,
W.W., Chao, D.M., Wilson, C.J., Young, R.A. and Chait, B.T. (1997). Anal. Chem.
69, 3995-4001).

The sequences of the individual peptides from a single protein are then
analyzed by the bioinformatics approach described above to identify the S. aureus
protein interacting with the phage ORF. This analysis is performed by a computer
search of the S. aureus genome for an identified sequence. Alternatively, all tryptic
peptide fragments of the S. aureus genome can be predicted by computer software,
and the molecular mass of such fragments compared to the molecular mass of the
peptides obtained from each interacting protein eluted from the affinity matrix. The
responsible gene sequence can be obtained, for example, by using synthetic degenerate
nucleic acid sequences to pull out the corresponding homologous bacterial sequence.
Alternatively, antibodies can be generated against the peptide and used to isolate
nascent peptide/mRNA transcript complexes, from which the mRNA can be reverse
transcribed, cloned, and further characterized using the procedures discussed herein.
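
The mass-comparison alternative described above can be sketched as follows.
This minimal Python example assumes unmodified peptides, complete digestion
with no missed cleavages, the common rule that trypsin cleaves after K or R
except before P, and neutral monoisotopic masses rather than instrument m/z
values. The function names, the 0.01 Da tolerance, and the two-protein
"proteome" are hypothetical and serve only as an illustration.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(protein):
    """Split a protein after every K or R that is not followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):  # C-terminal peptide that does not end in K or R
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def count_matches(observed_masses, protein, tolerance=0.01):
    """Count observed masses lying within the tolerance of a predicted peptide mass."""
    predicted = [peptide_mass(p) for p in tryptic_peptides(protein)]
    return sum(
        any(abs(mass - candidate) <= tolerance for candidate in predicted)
        for mass in observed_masses
    )


def rank_candidates(observed_masses, proteome, tolerance=0.01):
    """Rank predicted proteins by the number of observed peptide masses they explain."""
    scores = {
        name: count_matches(observed_masses, sequence, tolerance)
        for name, sequence in proteome.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical predicted proteins and peptide masses observed for one eluted protein.
proteome = {"protein_A": "MKGGR", "protein_B": "AAAKSSR"}
observed = [peptide_mass("MK"), peptide_mass("GGR")]
print(rank_candidates(observed, proteome))
```

A simple match count is used here for clarity; it is a stand-in for whatever
scoring the computer software referred to above applies.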

A variety of other binding assay methods are known in the art and can be used
to identify interactions between phage proteins and bacterial proteins or other bacterial
cell components. Such methods that allow or provide identification of the bacterial
component can be used in this invention for identifying putative targets.

Validation of the interaction between the phage ORF product and the bacterial
proteins or other components can be obtained by a second independent assay (e.g.,
co-immunoprecipitation or protein-protein crosslinking experiments (Qiu, H.,
Garcia-Barrio, M.T., and Hinnebusch, A.G. (1998). Mol & Cell Biology 18, 2697-2711;
Brown, S. and Blumenthal, T. (1976). Proc. Natl. Acad. Sci. USA 73, 1131-1135)).

Finally, the essential nature of the identified bacterial proteins is preferably
determined genetically by creating a constitutive or inducible partial or complete
loss-of-function mutation in the gene encoding the identified interacting bacterial
protein. This mutant is then tested for bacterial survival and replication.

The protein target of the phage inhibitor function can also be identified using a
genetic approach. Two exemplary approaches will be delineated here. The first
approach involves the overexpression of a predetermined phage inhibitor protein in
mutagenized host bacteria, e.g., S. aureus, followed by plating the cells and searching
for colonies that can survive the inhibitor. These colonies will then be grown, their
DNA extracted and cloned into an expression vector that contains a replicon of a
different incompatibility group, and preferably having a different selectable marker
than the plasmid expressing the phage inhibitor. Thus, host DNA fragments from the
mutant that can protect the cell from phage ORF inhibition can be sequenced and
compared with that of the bacterial host to determine in which gene the mutation lies.
This approach allows rapid determination of the targets and pathways that are affected
by the inhibitor.

Alternatively, the bacterial targets can be determined in the absence of
selecting for mutations using an approach known as "multicopy suppression". In this
approach, the DNA from the wild type host is cloned into an expression vector that
can coexist, as previously described, with one containing a predetermined phage
inhibitor. Those plasmids that contain host DNA fragments and genes that protect the
host from the phage inhibitor can then be isolated and sequenced to identify putative
targets and pathways in the host bacteria.

Regardless of the specific mode of identification, screening assays may
additionally utilize gene fusions to specific "reporter genes" to identify a bacterial
gene(s) whose expression is affected when the host target pathway is affected by the
phage inhibitor. Such gene fusions can be used to search a number of small molecule
compounds for inhibitors that may affect this pathway and thus cause cell inhibition.
This approach will allow the screening of a large number of molecules on petri dishes
or 96-well format by monitoring for a simple color change in the bacterial colonies.
In this manner, we can validate host targets and classes of compounds for further
study and clinical development. These inhibitors also represent lead compounds for
the development of other antibiotics.

Bioinformatics and comparative genomics are preferably then applied to the
identified bacterial gene products to predict biochemical function. The biochemical
activity of the protein can be verified in vitro in cell free assays or in vivo in intact
cells. In vitro biochemical assays utilizing cell-free extracts or purified protein are
established as a basis for the screening and development of inhibitors.

These inhibitors, preferably small molecule inhibitors, may comprise peptides,
antibodies, products from natural sources such as fungal or plant extracts or small
molecule organic compounds. In general, small molecule organic compounds are
preferred. These compounds may, for example, be identified within large compound
libraries, including combinatorial libraries. For example, a plurality of compounds,
preferably a large number of compounds can be screened to determine whether any of
the compounds binds or otherwise disrupts or inhibits the identified bacterial target.






Compounds identified as having any of these activities can then be evaluated further 
in cell culture and/or animal model systems to determine the pharmacological 
properties of the compound, including the specific anti-microbial ability of the 
compound. 

For mixtures of natural products, including crude preparations, once a
preparation or fraction of a preparation is shown to have an anti-microbial activity,
the active substance can be isolated and identified using techniques well known in the
art, if the compound is not already available in a purified form.

Identified compounds possessing anti-microbial activity, and structurally
similar compounds, can be further evaluated and, if necessary,
derivatized according to synthesis and/or modification methods available in the art
selected as appropriate for the particular starting molecule.

Derivatization of identified anti-microbials

In cases where the identified anti-microbials above might represent peptidal
compounds, the in vivo effectiveness of such compounds may be advantageously
enhanced by chemical modification using the natural polypeptide as a starting point
and incorporating changes that provide advantages for use, for example, increased
stability to proteolytic degradation, reduced antigenicity, improved tissue penetration,
and/or improved delivery characteristics.

In addition to active modifications and derivative creations, it can also be
useful to provide inactive modifications or derivatives for use as negative controls or
for induction of immunologic tolerance. For example, a biologically inactive
derivative which has essentially the same epitopes as the corresponding natural
antimicrobial can be used to induce immunological tolerance in a patient being
treated. The induction of tolerance can then allow uninterrupted treatment with the
active anti-microbial to continue for a significantly longer period of time.

Modified anti-microbial polypeptides and derivatives can be produced using a
number of different types of modifications to the amino acid chain. Many such
methods are known to those skilled in the art. The changes can include, for example,
reduction of the size of the molecule, and/or the modification of the amino acid
sequence of the molecule. In addition, a variety of different chemical modifications of
the naturally occurring polypeptide can be used, either with or without modifications
to the amino acid sequence or size of the molecule. Such chemical modifications can,
for example, include the incorporation of modified or non-natural amino acids or
non-amino acid moieties during synthesis of the peptide chain, or the post-synthesis
modification of incorporated chain moieties.

The oligopeptides of this invention can be synthesized chemically or through
an appropriate gene expression system. Synthetic peptides can include both naturally
occurring amino acids and laboratory synthesized, modified amino acids.

Also provided herein are functional derivatives of anti-microbial proteins or
polypeptides. By "functional derivative" is meant a "chemical derivative,"
"fragment," "variant," "chimera," or "hybrid" of the polypeptide or protein, which
terms are defined below. A functional derivative retains at least a portion of the
function of the protein, for example reactivity with a specific antibody, enzymatic
activity or binding activity.

A "chemical derivative" of the complex contains additional chemical moieties
not normally a part of the protein or peptide. Such moieties may improve the
molecule's solubility, absorption, biological half-life, and the like. The moieties may
alternatively decrease the toxicity of the molecule, eliminate or attenuate any
undesirable side effect of the molecule, and the like. Moieties capable of mediating
such effects are disclosed in Alfonso and Gennaro (1995). Procedures for coupling
such moieties to a molecule are well known in the art. Covalent modifications of the
protein or peptides are included within the scope of this invention. Such
modifications may be introduced into the molecule by reacting targeted amino acid
residues of the peptide with an organic derivatizing agent that is capable of reacting
with selected side chains or terminal residues, as described below.

Cysteinyl residues most commonly are reacted with alpha-haloacetates (and
corresponding amines), such as chloroacetic acid or chloroacetamide, to give
carboxymethyl or carboxyamidomethyl derivatives. Cysteinyl residues also are
derivatized by reaction with bromotrifluoroacetone, chloroacetyl phosphate,
N-alkylmaleimides, 3-nitro-2-pyridyl disulfide, methyl 2-pyridyl disulfide,
p-chloromercuribenzoate, 2-chloromercuri-4-nitrophenol, or
chloro-7-nitrobenzo-2-oxa-1,3-diazole.

Histidyl residues are derivatized by reaction with diethylpyrocarbonate at pH
5.5-7.0 because this agent is relatively specific for the histidyl side chain.
Para-bromophenacyl bromide also is useful; the reaction is preferably performed in 0.1 M
sodium cacodylate at pH 6.0.

Lysinyl and amino terminal residues are reacted with succinic or other
carboxylic acid anhydrides. Derivatization with these agents has the effect of
reversing the charge of the lysinyl residues. Other suitable reagents for derivatizing
primary amine-containing residues include imidoesters such as methyl
picolinimidate; pyridoxal phosphate; pyridoxal; chloroborohydride;
trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and
transaminase-catalyzed reaction with glyoxylate.

Arginyl residues are modified by reaction with one or several conventional
reagents, among them phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, and
ninhydrin. Derivatization of arginine residues requires that the reaction be performed
in alkaline conditions because of the high pKa of the guanidine functional group.
Furthermore, these reagents may react with the groups of lysine as well as the arginine
alpha-amino group.

Tyrosyl residues are well-known targets of modification for introduction of
spectral labels by reaction with aromatic diazonium compounds or tetranitromethane.
Most commonly, N-acetylimidazole and tetranitromethane are used to form O-acetyl
tyrosyl species and 3-nitro derivatives, respectively.

Carboxyl side groups (aspartyl or glutamyl) are selectively modified by
reaction with carbodiimides (R'-N=C=N-R') such as 1-cyclohexyl-3-(2-morpholinyl-(4-ethyl))
carbodiimide or 1-ethyl-3-(4-azonia-4,4-dimethylpentyl) carbodiimide. Furthermore,
aspartyl and glutamyl residues are converted to asparaginyl and glutaminyl residues
by reaction with ammonium ions.

Glutaminyl and asparaginyl residues are frequently deamidated to the
corresponding glutamyl and aspartyl residues. Alternatively, these residues are
deamidated under mildly acidic conditions. Either form of these residues falls within
the scope of this invention.

Derivatization with bifunctional agents is useful, for example, for
cross-linking component peptides to each other or the complex to a water-insoluble support
matrix or to other macromolecular carriers. Commonly used cross-linking agents
include, for example, 1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde,
N-hydroxysuccinimide esters, for example, esters with 4-azidosalicylic acid,
homobifunctional imidoesters, including disuccinimidyl esters such as
3,3'-dithiobis(succinimidylpropionate), and bifunctional maleimides such as
bis-N-maleimido-1,8-octane. Derivatizing agents such as
methyl-3-[(p-azidophenyl)dithio]propioimidate yield photoactivatable intermediates that
are capable of forming crosslinks in the presence of light. Alternatively, reactive
water-insoluble matrices such as cyanogen bromide-activated carbohydrates and the
reactive substrates described in U.S. Patent Nos. 3,969,287; 3,691,016; 4,195,128;
4,247,642; 4,229,537; and 4,330,440 are employed for protein immobilization.

Other modifications include hydroxylation of proline and lysine,
phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the
alpha-amino groups of lysine, arginine, and histidine side chains (Creighton, T.E.,
Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco,
pp. 79-86 (1983)), acetylation of the N-terminal amine, and, in some instances,
amidation of the C-terminal carboxyl groups.

Such derivatized moieties may improve the stability, solubility, absorption,
biological half life, and the like. The moieties may alternatively eliminate or attenuate
any undesirable side effect of the protein complex. Moieties capable of mediating
such effects are disclosed, for example, in Alfonso and Gennaro (1995).

The term "fragment" is used to indicate a polypeptide derived from the amino
acid sequence of the protein or polypeptide having a length less than the full-length
polypeptide from which it has been derived. Such a fragment may, for example, be
produced by proteolytic cleavage of the full-length protein. Preferably, the fragment
is obtained recombinantly by appropriately modifying the DNA sequence encoding
the proteins to delete one or more amino acids at one or more sites of the C-terminus,
N-terminus, and/or within the native sequence.

Another functional derivative intended to be within the scope of the present
invention is a "variant" polypeptide that either lacks one or more amino acids or
contains additional or substituted amino acids relative to the native polypeptide. The
variant may be derived from a naturally occurring polypeptide by appropriately
modifying the protein DNA coding sequence to add, remove, and/or to modify codons
for one or more amino acids at one or more sites of the C-terminus, N-terminus,
and/or within the native sequence.

A functional derivative of a protein or polypeptide with deleted, inserted
and/or substituted amino acid residues may be prepared using standard techniques
well-known to those of ordinary skill in the art. For example, the modified
components of the functional derivatives may be produced using site-directed
mutagenesis techniques (as exemplified by Adelman et al., 1983, DNA 2: 183;
Sambrook et al., 1989) wherein nucleotides in the DNA coding sequence are modified
such that a modified coding sequence is produced, and thereafter expressing this
recombinant DNA in a prokaryotic or eukaryotic host cell, using techniques such as
those described above. Alternatively, components of functional derivatives of
complexes with amino acid deletions, insertions and/or substitutions may be
conveniently prepared by direct chemical synthesis, using methods well-known in the
art.

Insofar as other anti-microbial inhibitor compounds identified by the invention
described herein may not be peptidal in nature, other chemical techniques exist to
allow their suitable modification as well, according to the desirable principles
discussed above.

Administration and Pharmaceutical Compositions

For the therapeutic and prophylactic treatment of infection, the preferred
method of preparation or administration of anti-microbial compounds will generally
vary depending on the precise identity and nature of the anti-microbial being
delivered. Thus, those skilled in the art will understand that administration methods
known in the art will also be appropriate for the compounds of this invention.

The particularly desired anti-microbial can be administered to a patient either
by itself, or in pharmaceutical compositions where it is mixed with suitable carriers or
excipient(s). In treating an infection, a therapeutically effective amount of an agent or
agents is administered. A therapeutically effective dose refers to that amount of the
compound that results in amelioration of one or more symptoms of bacterial infection
and/or a prolongation of patient survival or patient comfort.

Toxicity, therapeutic and prophylactic efficacy of anti-microbials can be
determined by standard pharmaceutical procedures in cell cultures and/or
experimental organisms such as animals, e.g., for determining the LD50 (the dose
lethal to 50% of the population) and the ED50 (the dose therapeutically effective in
50% of the population). The dose ratio between toxic and therapeutic effects is the
therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that
exhibit large therapeutic indices are preferred. The data obtained from these cell
culture assays and animal studies can be used in formulating a range of dosage for use
in humans. The dosage of such compounds lies preferably within a range of
circulating concentrations that include the ED50 with little or no toxicity. The dosage
may vary within this range depending upon the dosage form employed and the route
of administration utilized.
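
The therapeutic index arithmetic described above can be illustrated with a short
Python sketch. The compound names and dose values below are hypothetical and are
not data from this disclosure; both doses are assumed to be expressed in the same
units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Ratio of the dose lethal to 50% of the population to the dose effective in 50%."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in the same dose units.
candidates = {
    "compound_A": (400.0, 20.0),
    "compound_B": (150.0, 30.0),
}

# Larger indices are preferred, so rank from largest to smallest.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
for name in ranked:
    print(name, therapeutic_index(*candidates[name]))
```

Ranking by this ratio only reflects the stated preference for large therapeutic
indices; dose selection would still depend on the dosage form and route of
administration.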

For any compound identified and used in the method of the invention, the
therapeutically effective dose can be estimated initially from cell culture assays. Such
information can be used to more accurately determine useful doses in organisms such
as plants and animals, preferably mammals, and most preferably humans. Levels in
plasma may be measured, for example, by HPLC or other means appropriate for
detection of the particular compound.

The exact formulation, route of administration and dosage can be chosen by
the individual physician in view of the patient's condition (see, e.g., Fingl et al., in The
Pharmacological Basis of Therapeutics, 1975, Ch. 1, p. 1).

It should be noted that the attending physician would know how and when to
terminate, interrupt, or adjust administration due to toxicity, organ dysfunction, or
other systemic malady. Conversely, the attending physician would also know to
adjust treatment to higher levels if the clinical response were not adequate (precluding
toxicity). The magnitude of an administered dose in the management of the disorder
of interest will vary with the severity of the condition to be treated and the route of
administration. The severity of the condition may, for example, be evaluated, in part,
by standard prognostic evaluation methods. Further, the dose, and perhaps dose
frequency, will also vary according to the age, body weight, and response of the
individual patient. A program comparable to that discussed above also may be used
in veterinary or phytomedicine.

Depending on the specific infection target being treated and the method
selected, such agents may be formulated and administered systemically or locally, i.e.,
topically. Techniques for formulation and administration may be found in Alfonso
and Gennaro (1995). Suitable routes may include, for example, oral, rectal,
transdermal, vaginal, transmucosal, intestinal, parenteral, intramuscular,
subcutaneous, or intramedullary injections, as well as intrathecal, intravenous, or
intraperitoneal injections.

For injection, the agents of the invention may be formulated in aqueous
solutions, preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiological saline buffer. For transmucosal administration,
penetrants appropriate to the barrier to be permeated are used in the formulation.
Such penetrants are generally known in the art.

Use of pharmaceutically acceptable carriers to formulate identified
anti-microbials of the present invention into dosages suitable for systemic
administration is within the scope of the invention. With proper choice of carrier and
suitable manufacturing practice, the compositions of the present invention, in
particular those formulated as solutions, may be administered parenterally, such as by
intravenous injection. Appropriate compounds can be formulated readily using
pharmaceutically acceptable carriers well known in the art into dosages suitable for
oral administration. Such carriers enable the compounds of the invention to be
formulated as tablets, pills, capsules, liquids, gels, syrups, slurries, suspensions and
the like, for oral ingestion by a patient to be treated.

Agents intended to be administered intracellularly may be administered using
techniques well known to those of ordinary skill in the art. For example, such agents
may be encapsulated into liposomes, then administered as described above.
Liposomes are spherical lipid bilayers with aqueous interiors. All molecules present
in an aqueous solution at the time of liposome formation are incorporated into the
aqueous interior. The liposomal contents are both protected from the external
microenvironment and, because liposomes fuse with cell membranes, are efficiently
delivered into the cell cytoplasm. Additionally, due to their hydrophobicity, small
organic molecules may be directly administered intracellularly.

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to 
achieve the intended purpose. Determination of the effective amounts is well within
the capability of those skilled in the art. 

In addition to the active ingredients, these pharmaceutical compositions may 
contain suitable pharmaceutically acceptable carriers comprising excipients and
auxiliaries which facilitate processing of the active compounds into preparations
which can be used pharmaceutically. The preparations formulated for oral

administration may be in the form of tablets, dragees, capsules, or solutions, including 
those formulated for delayed release or only to be released when the pharmaceutical 
reaches the small or large intestine. 

The pharmaceutical compositions of the present invention may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating,
entrapping or lyophilizing processes. 

Pharmaceutical formulations for parenteral administration include aqueous 
solutions of the active anti-microbial compounds in water-soluble form.
Alternatively, suspensions of the active compounds may be prepared as appropriate
oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils
such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides,
or liposomes. Aqueous injection suspensions may contain substances which increase
the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents
which increase the solubility of the compounds to allow for the preparation of highly 
concentrated solutions. 

Pharmaceutical preparations for oral use can be obtained by combining the
active compounds with solid excipient, optionally grinding a resulting mixture, and
processing the mixture of granules, after adding suitable auxiliaries, if desired, to
obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as
sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such
as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum
tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as the cross-linked polyvinyl pyrrolidone,
agar, or alginic acid or a salt thereof such as sodium alginate.

Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium
dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for identification
or to characterize different combinations of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit 
capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a 
plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active
ingredients in admixture with filler such as lactose, binders such as starches, and/or
lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft 
capsules, the active compounds may be dissolved or suspended in suitable liquids, 
such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. 

The above methodologies may be employed either actively or prophylactically
against an infection of interest.

Computer-related Aspects and Embodiments 

In addition to the provision of compounds as chemical entities, nucleotide
sequences, or fragments thereof at least 95%, preferably at least 97%, more preferably
at least 99%, and most preferably at least 99.9% identical to phage inhibitor sequences
can also be provided in a variety of additional media to facilitate various uses.
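
As a hedged illustration of the identity thresholds recited above, the following Python sketch computes percent identity between two sequences that are assumed to be already aligned and of equal length (gaps and alignment itself are outside the scope of the sketch). The function names and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def identity_tier(pct):
    """Return the highest recited threshold (99.9, 99, 97 or 95) that pct meets, else None."""
    for threshold in (99.9, 99.0, 97.0, 95.0):
        if pct >= threshold:
            return threshold
    return None


# Hypothetical 100-nucleotide sequences differing at a single position.
reference = "ACGT" * 25
variant = "ACGT" * 24 + "ACGA"
```

With these inputs, one mismatch in 100 positions meets the 99% tier but not the 99.9% tier.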

Thus, as used in this section, "provided" refers to an article of manufacture,
rather than an actual nucleic acid molecule, which contains a nucleotide sequence of
the present invention; e.g., a nucleotide sequence of an exemplary bacteriophage or a
sequence encoding a bacterial target or a fragment thereof, preferably a nucleotide
sequence at least 95%, more preferably at least 99% and most preferably at least
99.9% identical to such a bacteriophage or bacterial sequence, for example, to a
polynucleotide of an unsequenced phage listed in Table 1, preferably of bacteriophage
77 (S. aureus host) or bacteriophage 3A (S. aureus host) or bacteriophage 96 (S.
aureus host). Such an article provides a large portion of the particular bacteriophage
genome or bacterial gene and parts thereof (e.g., a bacteriophage open reading frame
(ORF)) in a form which allows a skilled artisan to examine and/or analyze the
sequence using means not directly applicable to examining the actual genome or gene
or subset thereof as it exists in nature or in purified form as a chemical entity.

In one application of this aspect, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium that can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such
as floppy discs, hard disc storage medium, magnetic tape; optical storage media such
as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used
to create an article of manufacture which includes one or more computer readable
media having recorded thereon a nucleotide sequence or sequences of the present
invention. Likewise, it will be clear to those of skill how additional computer
readable media that may be developed also can be used to create analogous
manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on
computer readable medium. A skilled artisan can readily adopt any of the presently
known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present
invention.

A variety of data storage structures are available to a skilled artisan for
creating a computer readable medium having recorded thereon a nucleotide sequence
of the present invention. The choice of the data storage structure will generally be
based on the means chosen to access the stored information. In addition, a variety of
data processor programs and formats can be used to store the nucleotide sequence
information of the present invention on computer readable medium. The sequence
information can, for example, be presented in a word processing text file, formatted in
commercially available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as
DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data processor structuring formats (e.g., text file or database) in order to obtain
computer readable medium having recorded thereon the nucleotide sequence
information of the present invention.
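
One possible way of recording a sequence as an ASCII file, and of accessing it again, is sketched below in Python. The FASTA-style layout (a ">" header line followed by wrapped sequence lines), the function names and the example record are assumptions made for illustration; the text above does not prescribe any particular file layout.

```python
import os
import tempfile


def write_fasta(path, name, sequence, width=60):
    """Record one named sequence as plain ASCII text, wrapped at `width` characters."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{name}\n")
        for i in range(0, len(sequence), width):
            handle.write(sequence[i:i + width] + "\n")


def read_fasta(path):
    """Read back the single record written by write_fasta."""
    name, chunks = None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
            elif line:
                chunks.append(line)
    return name, "".join(chunks)


def roundtrip(name, sequence):
    """Record a sequence in a temporary file, read it back, and clean up."""
    fd, path = tempfile.mkstemp(suffix=".fa")
    os.close(fd)
    try:
        write_fasta(path, name, sequence)
        return read_fasta(path)
    finally:
        os.remove(path)
```

The same record could equally be stored as a row in a database table; the text file is used here only because it needs no additional software.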

Computer software is publicly available which allows a skilled artisan to
access sequence information provided in a computer readable medium. Thus, by
providing in computer readable form a nucleotide sequence of an unsequenced
bacteriophage, such as an exemplary bacteriophage listed in Table 1, or of a sequence
encoding a bacterial target or a fragment thereof, preferably a nucleotide sequence at
least 95%, more preferably at least 99% and most preferably at least 99.9% identical
to such a bacteriophage or bacterial sequence, for example, to a polynucleotide of
bacteriophage 77 (S. aureus host), bacteriophage 3A (S. aureus host), bacteriophage
96 (S. aureus host), bacteriophage 44AHJD (S. aureus host), bacteriophage Dp-1
(Streptococcus pneumoniae host), or bacteriophage 182 (Enterococcus host), the
present invention enables the skilled artisan to routinely access the provided sequence
information for a wide variety of purposes.

Those skilled in the art understand that a variety of different search or
analysis software can be used to implement sequence search and analysis
algorithms, e.g., the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms. For
example, such search algorithms can be implemented on a Sybase system and used to
identify open reading frames (ORFs) within the bacteriophage genome which contain
homology to ORFs or proteins from other viruses, e.g., other bacteriophage, and other
organisms, e.g., the host bacterium. Among the ORFs discussed herein are protein
encoding fragments of the bacteriophage genomes which encode bacteria-inhibiting
proteins or fragments.

The present invention further provides systems, particularly computer-based
systems, which contain the sequence information described. Such systems are
designed to identify, among other things, useful fragments of the bacteriophage 
genomes. 

As used herein, "a computer-based system" refers to the hardware, software,
and data storage media used to analyze the nucleotide sequence information of the
present invention. The minimum hardware of the computer-based systems of the
present invention comprises a central processing unit (CPU), input device, output
device, and data storage medium or media. A skilled artisan will readily recognize
that any of the currently available general purpose computer-based systems are suitable
for use in the present invention, as well as a variety of different specialized or
dedicated computer-based systems.

As stated above, the computer-based systems of the present invention 
comprise data storage media having stored therein a nucleotide sequence of the 
present invention and the necessary hardware and software for supporting and
implementing a search and/or analysis program.

As used herein, "data storage media" refers to memory which can store 
nucleotide sequence information of the present invention, or a memory access means 
which can access manufactures having recorded thereon the nucleotide sequence 
information of the present invention. 

As used herein, "search program" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.



WO 00/32825 PCT/IB99/02040 


Search means are used to identify fragments or regions of the present genomic
sequences which match a particular target sequence or target motif. A variety of
known algorithms are disclosed publicly, and a variety of commercially available
software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not
limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan
can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches and/or sequence analyses can be
adapted for use in the present computer-based systems.
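
The simplest form of such a search, an exact match of a target sequence against a stored sequence on both strands, can be sketched in Python as follows. This is an illustrative stand-in and not an implementation of MacPattern, BLASTN or BLASTX; the function names and the example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_target(stored, target):
    """Return (strand, position) for every exact occurrence of `target`.

    Positions are 0-based on the stored (forward) sequence. A "-" hit means
    the reverse complement of the target occurs at that position.
    """
    if len(target) < 6:
        raise ValueError("target sequence should be six or more nucleotides")
    stored, target = stored.upper(), target.upper()
    hits = []
    for strand, query in (("+", target), ("-", reverse_complement(target))):
        start = stored.find(query)
        while start != -1:
            hits.append((strand, start))
            start = stored.find(query, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical stored sequence containing the target once on each strand.
stored_sequence = "CCGATTACACCTGTAATCCC"
matches = find_target(stored_sequence, "GATTACA")
```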

As used herein in connection with sequence searches and analyses, a "target
sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target
sequence is, the less likely a target sequence will be present as a random occurrence in
the database. Also, the target sequence length is preferably selected to include
sequence corresponding to a biologically relevant portion of an encoded product, for
example a region which is expected to be conserved across a range of source
organisms. Preferably the sequence length of a target polypeptide sequence is from
5-100 amino acids, more preferably 7-50 or 7-100 amino acids, and still more preferably
10-80 or 10-100 amino acids. Preferably the sequence length of a target
polynucleotide sequence is from 15-300 nucleotide residues, more preferably from
21-240 or 21-300, and still more preferably 30-150 or 30-300 nucleotide residues.
However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may
be of shorter length. Likewise, it may be desirable to search and/or analyze longer
sequences.
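
The relationship between target length and random occurrence can be made concrete with a short calculation. The Python sketch below assumes a simplified model that is not part of the original text: a single strand in which each of the four bases is equally likely and independent at every position. The function name and the 40,000-nucleotide database size are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of one target on one strand.

    Assumes independent, equally likely residues: each of the
    (database_length - target_length + 1) start positions matches with
    probability alphabet_size ** -target_length.
    """
    positions = database_length - target_length + 1
    if positions <= 0:
        return 0.0
    return positions * alphabet_size ** -target_length


# Hypothetical 40,000-nucleotide database: chance hits for increasing target lengths.
chance_hits = {n: expected_random_hits(n, 40_000) for n in (6, 15, 21, 30)}
```

Under these assumptions a six-nucleotide target is expected to occur by chance several times in such a database, while a 15-nucleotide target is expected far less than once, which is consistent with the preference for longer targets stated above.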

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the sequence(s) are
chosen based on a three-dimensional configuration which is formed upon the folding
of the target motif. There are a variety of target motifs known in the art. Protein
target motifs include, but are not limited to, enzymatic active sites and signal
sequences. Nucleic acid target motifs include, but are not limited to, promoter
sequences, hairpin structures and inducible expression elements (protein binding
sequences).

A variety of structural formats for the input and output devices can be used to
input and output the information in the computer-based systems of the present
invention. A preferred format for an output device ranks fragments of the
bacteriophage or bacterial sequences possessing varying degrees of homology to the
target sequence or target motif. Such presentation provides a skilled artisan with a
ranking of sequences which contain various amounts of the target sequence or target
motif and identifies the degree of homology contained in the identified fragment.
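
A ranked output of the kind described can be sketched in Python as follows. The sketch scores every window of the stored sequence by ungapped percent identity to the target, which is a deliberately simple measure chosen for illustration and is not the scoring used by BLAST or any other named program; the function name and example sequences are hypothetical.

```python
def rank_fragments(stored, target, top=5):
    """Rank windows of the stored sequence by percent identity to the target.

    Returns up to `top` tuples of (percent_identity, start, fragment), best
    first; ties are broken by the earlier start position.
    """
    stored, target = stored.upper(), target.upper()
    width = len(target)
    scored = []
    for start in range(len(stored) - width + 1):
        window = stored[start:start + width]
        matches = sum(a == b for a, b in zip(window, target))
        scored.append((100.0 * matches / width, start, window))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]


# Hypothetical stored sequence with one exact and one near match to the target.
ranking = rank_fragments("CCGATTACACCGATTAGACC", "GATTACA", top=2)
```

Each entry reports the degree of identity together with the location of the fragment, so the exact match is listed first and the single-mismatch fragment second.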

A variety of comparing methods and/or devices and/or formats can be used to
compare a target sequence or target motif with the sequence stored in data storage
media to identify sequence fragments of the bacteriophage or bacterium in question.
One skilled in the art can readily recognize that any one of the publicly available
homology search programs can be used as the search program for the computer-based
systems of the present invention. Of course, suitable proprietary systems that may be
known to those of skill, or later developed, also may be employed in this regard.

Figure 6 provides a block diagram of a computer system illustrative of
embodiments of this aspect of the present invention. The computer system 102 includes a
processor 106 connected to a bus 104. Also connected to the bus 104 are a main
memory 108 (preferably implemented as random access memory, RAM) and a variety
of secondary storage devices 110, such as a hard drive 112 and a removable medium
storage device 114. The removable medium storage device 114 may represent, for
example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A
removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic
tape, etc.) containing control logic and/or data recorded therein may be inserted into
the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable
storage medium 116, once it is inserted into the removable medium storage
device 114.

A nucleotide sequence of the present invention may be stored in a well-known
manner in the main memory 108, any of the secondary storage devices 110, and/or a
removable storage medium 116. During execution, software for accessing and
processing the sequence (such as search tools, comparing tools, etc.) resides in main
memory 108, in accordance with the requirements and operating parameters of the
operating system, the hardware system and the software program or programs.

The data storage medium in which the sequence is embodied and the central
processor need not be part of a single stand-alone computer, but may be separated so
long as data transfer can occur. For example, the processor or processors being
utilized for a search or analysis can be part of one general purpose computer, and the
data storage medium can be part of a second general purpose computer connected to a
network, or the data storage medium can be part of a network server. As another
example, the data storage medium can be part of a computer system or network
accessible over telephone lines or other remote connection method.

EXAMPLES

Example 1. Growth of Staph A bacteriophage 77 and purification of genomic DNA

The Staphylococcus aureus propagating strain (PS 77; ATCC #27699) was
used as a host to propagate its respective phage 77 (ATCC #27699-B1). Two rounds
of plaque purification of phage 77 were performed on soft agar essentially as
described in Sambrook et al. (1989). Briefly, the PS 77 strain was grown overnight at
37°C in Nutrient broth [NB: 0.3% Bacto beef extract, 0.5% Bacto peptone (Difco
Laboratories) and 0.5% NaCl (w/v)]. The culture was then diluted 20x in NB and
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, phage 77 was subjected to 10-fold serial dilutions using
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin (w/v)) and
10 µl of each dilution was used to infect 0.5 ml of the cell suspension in the presence
of 400 µg/ml CaCl2. After incubation of 15 min at room temperature (RT), 2 ml of
melted soft agar kept at 45°C (NB supplemented with 0.6% agar) was added to the
mixture and poured onto the surface of 100 mm nutrient agar plates (0.3% Bacto beef
extract, 0.5% Bacto peptone, 0.5% NaCl and 1.5% Bacto agar (w/v)). After overnight
incubation at 30°C, a single plaque was isolated, resuspended in 1 ml of phage buffer
by end over end rotation for 2 hrs at 20°C, and the phage suspension was diluted and
used for a second infection as described above. After overnight incubation at 30°C, a
single plaque was isolated and used as a stock.

The propagation procedure for bacteriophage 77 was modified from the agar
layer method of Swanström and Adams (1951). Briefly, the PS 77 strain was grown to
stationary phase overnight at 37°C in Nutrient broth. The culture was then diluted
twenty-fold in NB and incubated at 37°C until the OD540 = 0.2. The suspension
(15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a
ratio of 100 bacteria/phage particle in the presence of 400 µg/ml of CaCl2. After
incubation for 15 min at 20°C, 7.5 ml of melted soft agar (NB plus 0.6% agar) were
added to the mixture and poured onto the surface of 150 mm nutrient agar plates and
incubated 16 hrs at 30°C. To collect the phage plate lysate, 20 ml of NB were added to
each plate and the soft agar layer was collected by scraping off with a clean microscope
slide followed by shaking of the agar suspension for 5 min to break up the agar. The
mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g) in a JA-10 rotor
(Beckman) and the supernatant fluid (lysate) was collected and subjected to a
treatment with 10 µg/ml of DNase I and RNase A for 30 min at 37°C. To precipitate
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and
0.5 M of NaCl followed by incubation at 4°C for 16 hrs. The phage was recovered by
centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R table top
centrifuge (Beckman). The pellet was resuspended with 2 ml of phage buffer (1 mM
MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The phage suspension was
extracted with 1 volume of chloroform and further purified by centrifugation on a
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS 55
rotor centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm
(67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 mg/ml Proteinase K and 0.5% SDS and
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris pH 8.0, 1 mM EDTA).
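
Two of the figures in the propagation procedure can be checked with simple arithmetic, sketched below in Python: the ratio of bacteria to phage particles, and the conversion of rotor speed to relative centrifugal force using the standard relation RCF = 1.118 x 10^-5 x r(cm) x rpm^2. The function names are hypothetical, and the 15.8 cm rotor radius is an assumption made for this illustration; the radius is not stated in the text.

```python
def bacteria_per_phage(n_bacteria, n_pfu):
    """Number of bacteria per plaque forming unit in the infection mixture."""
    return n_bacteria / n_pfu


def relative_centrifugal_force(rpm, radius_cm):
    """Relative centrifugal force (x g) from rotor speed and radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


# Ratio for the quantities described in the text: 15 x 10^7 bacteria, 15 x 10^5 pfu.
ratio = bacteria_per_phage(15e7, 15e5)

# ASSUMPTION: 15.8 cm radius, not given in the text; used only to show that
# 4,000 rpm is of the same order as the quoted 2,830 x g.
rcf = relative_centrifugal_force(4000, 15.8)
```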

Example 2. DNA sequencing of Bacteriophage 77 genome

Four micrograms of phage 77 DNA was diluted in 200 µl of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml eppendorf tube and sonication was performed
(550 Sonic Dismembrator™, Fisher Scientific). Samples were sonicated under an
amplitude of 3 µm with bursts of 5 s spaced by 15 s cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels utilizing TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution of 50 µl of 1 mM Tris
(pH 8.5).

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 µl)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 µg/ml BSA, 100 µM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C followed by addition of 12.5
units of Klenow large fragment (New England Biolabs) for 15 min at room
temperature. The reaction was stopped by two phenol/chloroform extractions and the
DNA was precipitated with ethanol and the final DNA pellet was resuspended in 20
µl of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
(New England Biolabs) site of the pKS II+ vector (Stratagene), which had been
dephosphorylated by treatment with calf intestinal alkaline phosphatase (New
England Biolabs). A typical ligation reaction contained 100 ng of vector DNA, 2 to 5 µl of
repaired sonicated phage DNA (50-100 ng) in a final volume of 20 µl containing 800
units of T4 DNA ligase (New England Biolabs) and was incubated overnight at 16°C.
Transformation and selection of bacterial clones containing recombinant plasmids was
performed in E. coli DH10β according to standard procedures (Sambrook et al.,
1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 µl LB and 100 µg/ml ampicillin and incubated at 37°C. The presence
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS II+ vector. PCR amplification of foreign
insert was performed in a 15 µl reaction volume containing 10 mM Tris (pH 8.3), 50
mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 µM primer, 187.5 µM each dNTP, and
0.75 units Taq polymerase (BRL). The thermocycling parameters were as follows:
initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 57°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
QIAprep™ spin miniprep kit (Qiagen).
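
For planning purposes, the thermocycling program described above can be written down as data and its total programmed hold time computed, as in the following Python sketch. The variable and function names are hypothetical, and the calculation ignores the temperature ramp times of any particular thermocycler, which are not given in the text.

```python
# Thermocycling program as described: (step name, temperature in °C, seconds).
INITIAL_DENATURATION = ("initial denaturation", 94, 2 * 60)
CYCLE_STEPS = [
    ("denaturation", 94, 30),
    ("annealing", 57, 30),
    ("extension", 72, 2 * 60),
]
N_CYCLES = 20
FINAL_EXTENSION = ("final extension", 72, 10 * 60)


def total_hold_time_seconds():
    """Sum of all programmed hold times, excluding instrument ramp times."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


total_minutes = total_hold_time_seconds() / 60
```

Under these assumptions the program holds for 72 minutes in total: 2 min, plus 20 cycles of 3 min each, plus 10 min.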

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI prism Big Dye™ primer or ABI prism Big Dye™ terminator cycle sequencing
ready reaction kit (Applied Biosystems). To ensure co-linearity of the sequence data
and the genome, all regions of phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing ABI prism Big Dye™ terminator cycle sequencing ready reaction
kit.

Example 3. Bioinformatic management of primary nucleotide sequence from
Phage 77.



Phage 77 sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing ABI
prism Big Dye™ terminator cycle sequencing ready reaction kit. The complete
sequence of bacteriophage 77 is shown in Table 2.

A software program was developed and used on the assembled sequence of
bacteriophage 77 to identify all putative ORFs larger than 33 codons. Other ORF
identification software can also be utilized, preferably programs which allow
alternative start codons. The software scans the primary nucleotide sequence starting
at nucleotide #1 for an appropriate start codon. Three possible selections can be made
for defining the nature of the start codon: I) selection of ATG, II) selection of ATG or
GTG, and III) selection of any of ATG, GTG, TTG, CTG, ATT, ATC, and ATA. This
latter initiation codon set corresponds to the one reported by the NCBI
(http://www.ncbi.nlm.n...) for the
bacterial genetic code.

When an appropriate start codon is encountered, a counting mechanism is
employed to count the number of codons (groups of three nucleotides) between this
start codon and the next stop codon downstream of it. If a threshold value of 33 is
reached, or exceeded, then the sequence encompassed by these two codons (start and
stop codons) is defined as an ORF. This procedure is repeated, each time starting at
the next nucleotide following the previous stop codon found, in order to identify all
the other putative ORFs. The scan is performed on all three reading frames of both
DNA strands of the phage sequence.
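
The scanning procedure described above can be sketched in Python as follows. This is an illustrative reconstruction and not the software actually used. Two points are assumptions: the codon count includes the start codon and excludes the stop codon, and the standard stop codons TAA, TAG and TGA are used. All function and variable names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(sequence, start_set="I", min_codons=33):
    """Scan all three frames of both strands for putative ORFs.

    Returns (strand, frame, start, end, n_codons) tuples. Coordinates are
    0-based on the scanned strand; `end` is exclusive and includes the stop
    codon. n_codons counts the start codon but not the stop codon.
    """
    starts = START_SETS[start_set]
    forward = sequence.upper()
    orfs = []
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            orf_start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if orf_start is None:
                    if codon in starts:
                        orf_start = pos  # first start codon after the last stop
                elif codon in STOP_CODONS:
                    n_codons = (pos - orf_start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, orf_start, pos + 3, n_codons))
                    orf_start = None  # resume scanning after this stop codon
    return orfs


# Hypothetical test sequence: ATG, 40 alanine codons, then a stop codon.
example_orfs = find_orfs("ATG" + "GCT" * 40 + "TAA")
```

Selecting start set "II" or "III" widens the set of initiation codons in the manner of selections II) and III) above; an ORF that lacks a downstream stop codon before the end of the sequence is not reported by this sketch.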

Sequence homology (BLAST) searches for each ORF are then carried out
using an implementation of BLAST programs, although any of a variety of different
sequence comparison and matching programs can be utilized as known to those
skilled in the art. Downloaded public databases used for sequence analysis include:

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr^), 

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) S. aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa);

vi) Streptococcus pyogenes (ftp://ftp.genome.ou.edu/pub/strep/strep-1k.fa);

vii) Streptococcus pneumoniae (ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

viii) Mycobacterium tuberculosis CSU89 (ftp://ftp.tigr.org/pub/data/m_tuberculosis/TB_091097.Z); and

ix) Pseudomonas aeruginosa (http://www.genome.washington.edu/pseudo/data.html).

The results of the homology searches performed on the ORFs are shown in
Table 5.

Example 4. Subcloning of Bacteriophage 77 ORFs into a Staph A inducible
expression system.

The shuttle vector pT0021 t in which the firefly luciferase (lucFF) expression 
is controlled by the ars (arsenite) promoter/operator (Tauriainen et al M 1997), was 
modified in the following fashion. Two oligonucleotides corresponding to a short 
antigenic peptide derived from the heainaglutinin protein of influenza virus (HA 
10 epitope tag) were synthesized (Field et al., 1988). The sense strand HA tag sequence 
(with BamHl, Sail and Hinalll cloning sites) is: 

5 , -gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3 , 
(where upper case letters denote the nucletotide sequence of the HA tag); the antisense 
strand HA tag sequence (with a HindlU cloning site) is: 

15 5 ' -agctTC AGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3 ' 
(where upper case letters denote the sequence of the HA tag). The two HA tag 
oligonucleotides were annealed and ligated into pT002 1 vector which had been 
digested with BamHl and HindUl* This manipulation resulted in replacement of the 
lucFF gene by the HA tag. This modified shuttle vector containing the arsenite 

20 inducible promoter, the arsR gene, and HA tag was named pTHA. A diagram 
outlining our modification of pT002l to generate pTHA is shown in Fig. 1A. 
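
The annealing of the two HA tag oligonucleotides can be checked with a short script. The following is a minimal sketch, not part of the described method: it assumes the two sequences as transcribed above and a 4-nucleotide 5' overhang on each strand, and the function names are hypothetical. It confirms that the strands are complementary outside the overhangs and reports the single-stranded ends left for ligation.

```python
# Sketch only: verifies that the sense and antisense HA-tag oligonucleotides
# form a duplex with 5' overhangs compatible with BamHI- and HindIII-cut vector.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SENSE = "gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA"
ANTISENSE = "agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_overhangs(sense: str, antisense: str, overhang: int = 4):
    """Return the 5' overhangs (sense, antisense) of the annealed duplex.

    Returns None if the strands are not complementary outside the overhangs.
    The overhang length of 4 is an assumption based on BamHI/HindIII ends.
    """
    s, a = sense.upper(), antisense.upper()
    # The double-stranded region is each strand minus its own 5' overhang.
    if s[overhang:] != reverse_complement(a[overhang:]):
        return None
    return s[:overhang], a[:overhang]


if __name__ == "__main__":
    print(annealed_overhangs(SENSE, ANTISENSE))
    # Internal SalI (GTCGAC) and HindIII (AAGCTT) sites in the sense strand
    print("GTCGAC" in SENSE.upper(), "AAGCTT" in SENSE.upper())
```

The sense strand carries a 5' GATC end, as left by BamHI digestion, and the antisense strand a 5' AGCT end, as left by HindIII digestion, which is consistent with ligation into the BamHI/HindIII-digested vector.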

Each ORF encoded by Bacteriophage 77 that is larger than 33 amino acids and 
has a Shine-Dalgarno sequence upstream of the initiation codon was selected for 
functional analysis for bacterial inhibition. In total, 98 ORFs were selected and 
screened as detailed below. A list of these is presented in Table 3. Each individual 
ORF, from initiation codon to last codon (excluding the stop codon), was amplified 
from phage genomic DNA using the polymerase chain reaction (PCR). For PCR 
amplification of ORFs, each sense strand primer targets the initiation codon and is 
preceded by a BamHI restriction site (5'cgggatcc3') and each antisense oligonucleotide 
targets the penultimate codon (the one before the stop codon) of the ORF and is 
preceded by a SalI restriction site (5'gcgtcgaccg3'). The PCR product of each ORF was 
gel purified and digested with BamHI and SalI. The digested PCR product was then 
gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested 
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described 
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is 
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant 
pTHA/ORF clones were picked and their insert sizes were confirmed by PCR analysis 
using primers flanking the cloning site. The names and sequences of the primers that 
were used for the PCR amplification were: HAF: 5'TATTATCCAAAACTTGAACA3'; 
HAR: 5'CGGTGGTATATCCAGTGATT3'. The 
sequence integrity of cloned ORFs was verified directly by DNA sequencing using 
primers HAF and HAR. In cases where verification of ORF sequence could not be 
achieved by one pass with the sequencing primers, additional internal primers were 
selected and used for sequencing. 
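
The primer layout described above can be illustrated in code. This is a hedged sketch rather than the procedure actually used: the 5' tails are taken from the text, while the function name and the default length of the ORF-specific portion (20 nucleotides) are assumptions introduced for illustration. The set of stop codons is the standard one and is also an assumption here.

```python
# Sketch only: builds a sense/antisense primer pair for sub-cloning an ORF
# without its stop codon, using the 5' tails quoted in the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}   # assumed standard stop codons
SENSE_TAIL = "cgggatcc"               # contains the BamHI site (ggatcc)
ANTISENSE_TAIL = "gcgtcgaccg"         # contains the SalI site (gtcgac)


def design_primers(orf: str, anneal_len: int = 20):
    """Return (sense_primer, antisense_primer) for an ORF given 5'->3'.

    `anneal_len` is a hypothetical choice; the text does not state how many
    ORF-specific nucleotides each primer carries.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be whole codons ending in a stop codon")
    coding = orf[:-3]  # drop the stop codon so a C-terminal tag can follow
    if len(coding) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    # Sense primer: tail followed by the ORF starting at the initiation codon.
    sense = SENSE_TAIL + coding[:anneal_len]
    # Antisense primer: tail followed by the reverse complement of the ORF
    # ending at the last codon before the stop codon.
    antisense = ANTISENSE_TAIL + coding[-anneal_len:].translate(COMPLEMENT)[::-1]
    return sense, antisense


if __name__ == "__main__":
    toy_orf = "ATG" + "GCT" * 10 + "TAA"   # toy sequence, not a phage ORF
    print(design_primers(toy_orf, anneal_len=9))
```

Lower case marks the added tail and upper case the ORF-specific part, mirroring the notation used for the HA tag oligonucleotides.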

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) was used as a 
recipient for the expression of recombinant plasmids. Electroporation was performed 
essentially as previously described (Schenk and Laddaga, 1992). Selection of 
recombinant clones was performed on Luria-Broth agar (LB-agar) plates containing 
30 ug/ml of kanamycin. 

For each ORF introduced in the pTHA plasmid, 3 independent transformants 
were isolated and used to individually inoculate cultures in 5 ml of TSB containing 
30 ug/ml kanamycin, followed by growth to saturation (16 hrs at 30°C). An aliquot of 
this stationary phase culture was used to generate a frozen glycerol stock of the 
transformant (stored at -80°C). The remaining culture was used for plasmid DNA 
extraction. Bacterial cells were harvested by centrifugation at 3000 x g at 22°C for 5 
min. The pellet was resuspended in 200 ul of 25% sucrose containing 25 U/ml of 
lysostaphin and incubated for 15 min at 37°C. Then, 400 ul of alkaline SDS solution 
(3% SDS, 0.2 N NaOH) were added, mixed well and incubated for 7 min at room 
temperature. After the alkaline SDS treatment, 300 ul of ice-cold 3 M sodium acetate 
pH 4.8 were added, and the mix was immediately spun at 13,000 x g for 15 min at room 
temperature. The supernatant was transferred to a new 1.5 ml conical centrifuge tube 
and 650 ul of isopropanol (stored at room temperature) were added. The mix was then 
centrifuged at 13,000 x g for 5 min. The supernatant fluid was discarded, the pellet 
washed with 70% ethanol, and resuspended in 320 ul sterile distilled water. 

The presence of individual phage 77 ORF DNA inserts in the plasmid was 
verified by PCR amplification using 1.5 ul transformant miniprep DNA in a PCR 
with primers flanking the cloning site of ORF in pTHA vector (HAF and HAR). The 
composition of the PCR reaction and the cycling parameters are identical to those 
employed for library screening described above. 

Example 5. Functional assay for bacterial inhibitory activity of bacteriophage 77 
ORFs. 

The anti-microbial activity of individual phage 77 ORFs was monitored by 
two growth inhibitory assays, one on solid agar medium, the other in liquid medium. 



WO 00/32825    PCT/IB99/02040 

In general, Staphylococcus bacteria transformed with expression plasmids containing 
individual ORFs were grown in normal TSA medium and stored in 19% glycerol. At 
predetermined times, arsenite was added to the culture to induce transcription of the 
phage 77 ORFs cloned immediately downstream from an arsenite-inducible promoter 
in the pTHA expression plasmid. 

The effect of ORF induction on bacterial growth characteristics was then 
monitored and quantitated. The growth inhibition assay on solid medium was 
performed by streaking pTHA/ORF containing S. aureus transformants onto LB-Kn 
and TSA-Kn plates containing increasing concentrations of sodium arsenite (0; 2.5; 5; 
and 7.5 uM). Arsenite is used to induce the expression of cloned DNA in pTHA 
vector. In parallel, 3 ul of 1/10 and 1/100 dilutions of the frozen cultures of the 
pTHA/ORF transformants were spotted as single drops onto LB-Kn and TSA-Kn 
plates containing increasing concentrations of sodium arsenite (0; 2.5; 5; and 7.5 uM). 
The plates were then incubated 16 hrs at 37°C, and the effect of arsenite-induced ORF 
expression on bacterial growth was monitored and quantitated by comparing the 
extent of growth to that seen in control plates. As positive controls for growth 
inhibition, the holin/lysin genes of the Staphylococcus aureus phage Twort (Loessner 
et al., 1998) were subcloned into the pTHA ars inducible vector and used. 

For the growth inhibition assay in liquid medium, stationary phase cultures 
were prepared by inoculating 2.5 ml TSB-Kn with frozen S. aureus RN4220 
transformants containing phage 77 ORFs cloned in pTHA vector followed by 
incubation for 16 hrs at 37°C. These cultures were then diluted 1/100 in the same 
medium, and the bacteria were allowed to grow for 2 hrs at 37°C to reach early log 
phase. 150 ul of such culture were then mixed with 2.35 ml TSB-Kn medium with or 
without arsenite (the final concentration of arsenite in the medium was 0 or 5 uM). 
After 3.5 hrs incubation at 37°C with shaking at 250 rpm, 100 ul of 
bacterial culture was removed from each tube for OD565 measurement. Serial ten-fold 
dilutions of the culture in buffered saline solution (0.85% NaCl) were then spotted 
onto TSB-Kn plates. The plates were incubated at 37°C for 16 hrs and the number of 
surviving colonies counted the following day. The growth inhibitory property of 
individual ORFs was then quantitated by comparing CFU numbers under normal or 
arsenite-induction conditions. A schematic flow of the inhibition analysis is shown in 
Fig. 3 (also applicable to inhibition analysis for the other phage and bacteria pointed 
out herein). Inhibition results are shown in Figures 4A-C. 
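
The CFU comparison can be written out as a small calculation. The sketch below is illustrative only: the function names, colony counts and the spotted volume are hypothetical, since the text does not state the volume spotted from each ten-fold dilution.

```python
# Sketch only: converts colony counts from a ten-fold dilution series into
# CFU/ml and compares induced with uninduced cultures.

def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """CFU per ml of the undiluted culture.

    `dilution_factor` is the fold dilution of the counted spot (e.g. 1e4 for
    the fourth ten-fold dilution); `plated_volume_ml` is the volume spotted.
    """
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def surviving_fraction(induced_cfu: float, uninduced_cfu: float) -> float:
    """Ratio of CFU/ml with arsenite to CFU/ml without arsenite."""
    if uninduced_cfu <= 0:
        raise ValueError("uninduced control must contain viable cells")
    return induced_cfu / uninduced_cfu


if __name__ == "__main__":
    # Hypothetical counts: 5 ul spots, uninduced counted at 1e4, induced at 1e2.
    uninduced = cfu_per_ml(42, 1e4, 0.005)
    induced = cfu_per_ml(42, 1e2, 0.005)
    print(uninduced, induced, surviving_fraction(induced, uninduced))
```

A surviving fraction well below 1 in the presence of arsenite would point to a growth inhibitory ORF; what threshold counts as inhibition is not specified in the text and is left to the reader.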

Example 6. Identification of Cecropin Signature Motif in Staphylococcus aureus 
Bacteriophage 3A ORF 

The genome sequence of S. aureus bacteriophage 3A was determined and the sequence 
was analyzed essentially as described for bacteriophage 77 in the examples above. 
Upon BLAST analysis of the identified open reading frames of phage 3A, the presence of 
an amino acid sequence corresponding to a cecropin signature motif was observed. 
This motif (WDGHKTLEK) is located at position aa 481-489. Cecropins were 
originally identified in proteins from the cecropia moth and are recognized as potent 
antibacterial proteins that constitute an important part of the cell-free immunity of 
insects. Cecropins are small proteins (31-39 amino acid residues) that are active 
against both Gram-positive and Gram-negative bacteria by disrupting the bacterial 
membranes. Although the mechanisms by which the cecropins cause cell death are 
not fully understood, they are generally thought to involve channel formation and 
membrane destabilization. 
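
Locating a motif within a translated ORF and reporting residue positions in the 1-based convention used above can be sketched as follows. This is an exact string match only, not the pattern-based signature search a motif database would perform, and the toy protein below is a hypothetical placeholder, not the phage 3A ORF product.

```python
# Sketch only: reports 1-based, inclusive positions of an exact peptide motif.

def find_motif(protein: str, motif: str):
    """Return a list of (start, end) residue positions for each exact match."""
    hits = []
    start = protein.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = protein.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical protein with the motif placed so that it spans aa 481-489.
    toy_protein = "M" + "A" * 479 + "WDGHKTLEK" + "GG"
    print(find_motif(toy_protein, "WDGHKTLEK"))
```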

The identification of a motif corresponding to a known inhibitor suggests that 
the product of ORF002 is also an inhibitory compound. Such inhibitory activity can 
be confirmed as described herein or by other methods known in the art. Confirmation 
of the inhibitory activity would indicate that the ORF product could serve as the basis 
for construction of mimetic compounds and other inhibitors directed to the target of 
the ORF002 product. 

Boman & Hultmark, 1987, Ann. Rev. Microbiol. 41:103-126. 
Boman, 1991, Cell 65:205-207. 
Boman et al., 1991, Eur. J. Biochem. 201:23-31. 
Wang et al., J. Biol. Chem. 273:27438-27448. 

Example 7. Growth of Staphylococcus aureus bacteriophage 44AHJD. 

Staphylococcus aureus propagating strain (PS 44A) (Felix d'Herelle Reference 
Centre #HER 1101) was used as a host to propagate its respective phage 44AHJD 
(Felix d'Herelle Reference Centre #HER 101). Two rounds of plaque purification of 
phage 44AHJD were performed on soft agar essentially as described in Sambrook et 
al. (1989). Briefly, the Staphylococcus aureus PS strain was grown overnight at 37°C 
in Nutrient Broth [NB: 3 g Bacto Beef Extract, 5 g Bactopeptone per liter (Difco 
Laboratories #0003-17-8), supplemented with 0.5% NaCl]. The culture was then 
diluted 20 fold in NB and incubated at 37°C until an OD540 of 0.2. In order to obtain 
single plaques, phage 44AHJD was subjected to 10-fold serial dilutions using the 
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin) and 10 ul 
were used to infect 0.5 ml of the cell suspension in the presence of 400 ug/ml of 
CaCl2. After incubation of 15 min at room temperature, 2 ml of melted soft agar (NB 
supplemented with 0.6% of agar) were added to the mixture and poured onto the 
surface of 100 mm nutrient agar plates (3 g Bacto Beef extract, 5 g Bactopeptone, 
0.5% NaCl and 15 g of Bacto agar per liter (Difco Laboratories #0001-17-0)). After 
overnight incubation at 37°C, a single plaque was isolated, resuspended in 1 ml of 
phage buffer by end over end rotation for 2 h at room temperature and the phage 
suspension was diluted and used for a second infection as described above. After 
overnight incubation at 37°C, a single plaque was isolated and used as a stock. 

Large scale purification of bacteriophage and preparation of phage DNA were 
as follows. 

The propagation method was carried out by using the agar layer method 
described by Swanström and Adams (1951). Briefly, the PS 44A strain was grown to 
stationary phase overnight at 37°C in Nutrient Broth. The culture was then diluted 20x 
in NB and incubated at 37°C until the A540 = 0.2. The suspension (15x10^7 bacteria) 
was then mixed with 15x10^5 phage particles to give a ratio of 100 bacteria/phage 
particle in the presence of 400 ug/ml of CaCl2. After incubation of 15 min at room 
temperature, 7.5 ml of melted soft agar were added to the mixture and poured onto the 
surface of 150 mm nutrient agar plates and incubated overnight at 37°C. To collect the 
lysate, 20 ml of NB were added to each plate and the soft agar layer was collected by 
scraping off with a clean microscope slide and shaken vigorously for 5 min to break 
up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 x g) 
using a JA-10 rotor (Beckman) and the supernatant (lysate) was collected and subjected 
to a treatment with 10 ug/ml of DNase I and RNase A for 30 min at 37°C. To 
precipitate the phage particles, 10% (w/v) of PEG 8000 and 0.5 M of NaCl were 
added to the lysate and the mixture was incubated on ice for 16 h. The phage was 
recovered by centrifugation at 4,000 rpm (3,500 x g) for 20 min at 4°C on a GS-6R 
table top centrifuge (Beckman). 

The pellet was resuspended with 2 ml of phage buffer (1 mM MgSO4, 5 mM 
MgCl2, 80 mM NaCl and 0.1% Gelatin). The phage suspension was extracted with 1 
volume of chloroform and further purified by centrifugation on a preformed cesium 
chloride step gradient as described in Sambrook et al. (1989), using a TLS 55 rotor 
and centrifuged in an Optima TLX ultracentrifuge (Beckman) for 2 h at 28,000 rpm 
(67,000 x g) at 4°C. Banded phage was collected and ultracentrifuged again on an 
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 x g) for 24 h at 
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 h at 
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM 
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage 
suspension by adding 20 mM EDTA, 50 ug/ml Proteinase K and 0.5% SDS and 
incubating for 1 h at 65°C, followed by successive extractions with 1 volume of 
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was 
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM 
EDTA). 

Example 8. DNA sequencing of the Bacteriophage 44AHJD genome. 

Four ug of phage DNA was diluted in 200 ul of TE pH 8.0 in a 1.5 ml 
eppendorf tube and sonication was performed (550 Sonic Dismembrator, Fisher 
Scientific). Samples were sonicated under an amplitude of 3 um with bursts of 5 s 
spaced by 15 s cooling in ice/water for 3 to 4 cycles. The sonicated DNA was then 
size fractionated by electrophoresis on 1% agarose gels. 
Fractions ranging from 1 to 2 kbp were excised from the agarose gel and purified 
using a commercial DNA extraction system according to the instructions of the 
manufacturer (Qiagen) and eluted in 50 ul of 1 mM Tris-HCl [pH 8.5]. 

The ends of the sonicated DNA fragments were repaired with a combination of 
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I as 
follows. Reactions were performed in a final volume of 100 ul containing DNA, 10 
mM Tris-HCl pH 8.0, 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 5 ug BSA, 100 uM 
of each dNTP and 15 units of T4 DNA polymerase (New England Biolabs) for 20 min 
at 12°C followed by addition of 12.5 units of Klenow fragment (New England 
Biolabs) for 15 min at room temperature. The reaction was stopped by two 
phenol/chloroform extractions and the DNA was ethanol precipitated and resuspended 
in 20 ul of H2O. 

Cloning of the sonicated phage DNA into pKSII vector and transformation: 

Blunt-ended DNA fragments were cloned by ligation directly into the HincII 
site of the pKSII vector (Stratagene) dephosphorylated with calf intestinal alkaline 
phosphatase (New England Biolabs). A typical reaction contained 100 ng of vector, 2 
to 5 ul of repaired sonicated phage DNA (50-100 ng) in a final volume of 20 ul 
containing 800 units of T4 DNA ligase (New England Biolabs) overnight at 16°C. 
Transformation and selection of positive clones was performed in the host strain 
DH10β of E. coli using ampicillin as a selective antibiotic as described in Sambrook 
et al. (1989). 

Recombinant clones were picked from agar plates into 96-well plates 
containing 100 ul LB and 100 ug/ml ampicillin and incubated at 37°C. The presence 
of phage DNA insert was confirmed by PCR amplification using T3 and T7 primers 
flanking the HincII cloning site of the pKS vector. PCR amplification of the potential 
foreign inserts was performed in a 15 ul reaction volume containing 10 mM Tris-HCl 
(pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 mM primer, 187.5 uM each 
dNTP, and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as 
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec 
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C, followed 
by a single extension step at 72°C for 10 min. Clones with insert sizes of 1 to 2 kbp 
were selected and plasmid DNA was prepared from the selected clones using the 
QIAprep™ spin miniprep kit (Qiagen). 

The nucleotide sequence of the extremities of each recombinant clone was determined 
using an ABI 377-36 automated sequencer with two types of chemistry: ABI prism 
BigDye™ primer cycle sequencing (21M13 primer: #403055) (M13REV primer: 
#403056) or ABI prism BigDye™ terminator cycle sequencing ready reaction kit 
(Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and the 
genome, all regions of the phage genome were sequenced at least once from both 
directions on two separate clones. In areas where this criterion was not initially met, a 
sequencing primer was selected and phage DNA was used directly as sequencing 
template employing ABI prism BigDye™ terminator cycle sequencing ready reaction 
kit. 

Example 9. Bioinformatic management of primary nucleotide sequence. 

Sequence contigs were assembled using Sequencher™ 3.1 software 
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of 
the contigs. Phage DNA was used directly as sequencing template employing ABI 
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems; 
#4303152). The complete sequence of Staphylococcus aureus bacteriophage 44AHJD 
is shown in Table 16. 

A software program was used on the assembled sequence of bacteriophage 
44AHJD to identify all putative ORFs larger than 33 codons. The software scans the 
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon. 
Three possible selections can be made for defining the nature of the start codon: I) 
selection of ATG, II) selection of ATG or GTG, and III) selection of any of ATG, 
GTG, TTG, CTG, ATT, ATC, and ATA. This latter initiation codon set corresponds 
to the one reported by the NCBI 
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the 
bacterial genetic code. When an 
appropriate start codon is encountered, a counting mechanism is employed to count 
the number of codons (groups of three nucleotides) between this start codon and the 
next stop codon downstream of it. If a threshold value of 33 is reached, or exceeded, 
then the sequence encompassed by these two codons is defined as an ORF. This 
procedure is repeated, each time starting at the next nucleotide following the previous 
stop codon found, in order to identify all the other putative ORFs. The scan is 
performed on all three reading frames of both DNA strands of the phage sequence. 
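
The scanning procedure described above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the software that was actually used: the stop codons TAA, TAG and TGA are assumed, the codon count is taken to run from the start codon up to (not including) the stop codon, scanning within a frame resumes immediately after each stop codon, and all function and variable names are hypothetical.

```python
# Sketch only: six-frame ORF scan with selectable start codon sets (I, II, III)
# and a minimum length threshold in codons.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}          # assumed stop codons
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}


def scan_frame(seq: str, frame: int, starts: set, min_codons: int):
    """Return (begin, end) 0-based half-open coordinates of ORFs in one frame."""
    orfs = []
    i = frame
    while i + 3 <= len(seq):
        if seq[i:i + 3] in starts:
            j = i + 3
            # Walk codon by codon until the next in-frame stop codon.
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break                         # no stop codon downstream
            if (j - i) // 3 >= min_codons:
                orfs.append((i, j + 3))       # include the stop codon
            i = j + 3                         # resume after the stop codon
        else:
            i += 3
    return orfs


def find_orfs(seq: str, selection: str = "III", min_codons: int = 33):
    """Scan all three frames of both strands; coordinates are strand-local."""
    seq = seq.upper()
    starts = START_SETS[selection]
    reverse = seq.translate(COMPLEMENT)[::-1]
    hits = []
    for strand, s in (("+", seq), ("-", reverse)):
        for frame in range(3):
            for begin, end in scan_frame(s, frame, starts, min_codons):
                hits.append((strand, frame, begin, end))
    return hits


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 40 + "TAA" + "C"   # toy sequence
    print(find_orfs(toy, selection="I"))
```

Coordinates on the minus strand refer to the reverse-complemented sequence; converting them back to positions on the deposited sequence is left out to keep the sketch short.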

The predicted ORFs for bacteriophage 44AHJD are listed in Tables 17 & 18. 
Sequence homology searches for each ORF were carried out using an 
implementation of BLAST programs. Downloaded public databases used for sequence 
analysis include: 

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z); 
ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z); 
iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z); 
iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z); 
v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa); 
vi) Staphylococcus pyogenes (ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z); 
vii) PRODOM (ftp://ftp.toulouse.inra.fr/pu[...]ast.gz); 
viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/); 
ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/). 

The results of the homology searches performed on the ORFs of bacteriophage 
44AHJD are shown in Tables 19 & 20. 

Example 10. Sub-Cloning of Bacteriophage 44AHJD ORFs. 

Expression preferably utilizes a shuttle expression vector which is arranged 
such that expression of the exogenous bacteriophage 44AHJD ORF sequence is 
inducible. For example, the shuttle vector pT0021, in which the firefly luciferase 
(lucFF) expression is controlled by the ars (arsenite) promoter/operator (Tauriainen et 
al., 1997), can be modified in the following fashion. Two oligonucleotides 
corresponding to a short antigenic peptide derived from the hemagglutinin protein of 
influenza virus (HA epitope tag) were synthesized (Field et al., 1988). The sense 
strand HA tag sequence (with BamHI, SalI and HindIII cloning sites) is: 

5'-gatcccggtcgaccaagcttTACCCATACGACGTCCCAGACTACGCCAGCTGA-3' 
(where upper case letters denote the nucleotide sequence of the HA tag); the antisense 
strand HA tag sequence (with a HindIII cloning site) is: 

5'-agctTCAGCTGGCGTAGTCTGGGACGTCGTATGGGTAaagcttggtcgaccgg-3' 
(where upper case letters denote the sequence of the HA tag). The two HA tag 
oligonucleotides were annealed and ligated into pT0021 vector which had been 
digested with BamHI and HindIII. This manipulation resulted in replacement of the 
lucFF gene by the HA tag. This modified shuttle vector containing the arsenite 
inducible promoter, the arsR gene, and HA tag was named pTHA. A diagram 
outlining our modification of pT0021 to generate pTHA is shown in Fig. 1A (another 
useful vector construct is shown in Fig. 1B). 

Each ORF encoded by Bacteriophage 44AHJD that is larger than 33 amino acids 
and has a Shine-Dalgarno sequence upstream of the initiation codon can be 
selected for functional analysis for bacterial inhibition. Each individual ORF, from 
initiation codon to last codon (excluding the stop codon), can be amplified from phage 
genomic DNA using the polymerase chain reaction (PCR). For PCR amplification of 
ORFs, each sense strand primer targets the initiation codon and is preceded by a 
BamHI restriction site (5'cgggatcc3') and each antisense oligonucleotide targets the 
penultimate codon (the one before the stop codon) of the ORF and is preceded by a 
SalI restriction site (5'gcgtcgaccg3'). The PCR product of each ORF can be gel 
purified and digested with BamHI and SalI. The digested PCR product can then be 
gel purified using the Qiagen kit as described, ligated into BamHI and SalI digested 
pTHA vector, and used to transform E. coli bacterial strain DH10β (as described 
above). As a result of this manipulation, the HA tag is set in frame with the ORF and is 
positioned at the carboxy terminus of each ORF (pTHA/ORF clones). Recombinant 
pTHA/ORF clones will be picked and their insert sizes confirmed by PCR 
analysis using primers flanking the cloning site. The following primers can be used 
for PCR amplification: HAF: 5'TATTATCCAAAACTTGAACA3'; HAR: 
5'CGGTGGTATATCCAGTGATT3'. The sequence integrity of cloned ORFs can be 
verified directly by DNA sequencing using primers HAF and HAR. In cases where 
verification of ORF sequence cannot be achieved by one pass with the sequencing 
primers, additional internal primers will be selected and used for sequencing. 

Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) will be used as 
a recipient for the expression of recombinant plasmids. Electroporation will be 
performed essentially as previously described (Schenk and Laddaga, 1992). Selection 
of recombinant clones will be performed on Luria-Broth agar (LB-agar) plates 
containing 30 ug/ml of kanamycin. 

Alternatively, a constitutive promoter can be used to drive expression of the 
introduced ORF, and cell growth compared to that of control bacterial cells containing the 
parental vector lacking any introduced phage ORF. Recombinant plasmids will be 
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using 
electroporation as previously described (Schenk and Laddaga, 1992). 

Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of 
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop 
codon), can be amplified by PCR from phage genomic DNA. For PCR amplification 
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a 
restriction site and each antisense strand primer starts at the last codon (excluding the stop 
codon) and is preceded by a different restriction site. The PCR product of each ORF 
will be gel purified and digested with the restriction enzymes whose sites are contained in 
the PCR oligonucleotides. The digested PCR product is then gel purified using the 
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial 
strain DH10β. Recombinant clones are then picked and their insert sizes confirmed by 
PCR analysis using primers flanking the cloning site as well as restriction digestion. 
The sequence fidelity of cloned ORFs can be verified by DNA sequencing using the 
same primers as used for PCR. In cases where the verification of ORFs cannot be 
achieved by one pass of sequencing using primers flanking the cloning site, internal 
primers can be selected and used for sequencing. Recombinant plasmids can be 
introduced into Staphylococcus aureus strain RN4220 (Kreiswirth et al., 1983) using 
electroporation as previously described (Schenk and Laddaga, 1992). 

Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be 
assessed, for example, by either of the following two methods. 

1. Screening on agar plates 

The functional identification of killer ORFs can be performed by spreading an aliquot 
of S. aureus transformed cells containing phage 44AHJD ORFs onto agar plates 
containing different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 uM). The 
plates are incubated overnight at 37°C, after which growth of the ORF 
transformants on plates that contain arsenite is compared to growth on plates without arsenite. 

2. Quantification of growth inhibition in liquid medium 

Cells containing different recombinant plasmids can be grown overnight at 
37°C in LB medium supplemented with the appropriate antibiotic selection. These are 
then diluted to the mid log phase (OD540 = 0.2) with fresh media containing antibiotic 
and transferred to 96-well microtitration plates (100 ul/well). Inducer is then added at 
different final concentrations (ranging from 2.5 to 10 uM) and the culture incubated 
for an additional 2 hrs at 37°C. The effect of expression of the phage 44AHJD ORFs 
on bacterial cell growth is then monitored by measuring the OD540 and comparing the 
rate of growth to that of the culture not containing inducer. As positive controls for growth 
inhibition, the kilA gene of phage lambda (Reisinger, G.R., Rietsch, A., Lubitz, W. and 
Blasi, U. 1993. Virology 193:1033-1036), and the holin/lysin genes of the 
Staphylococcus aureus phage Twort (Loessner, M.J., Gaeng, S., Wendlinger, G., 
Maier, S.K. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be 
subcloned into the ars inducible vector. An aliquot of the induced and uninduced 
culture can also be plated out on agar plates containing an appropriate antibiotic 
selection but lacking inducer. Following incubation overnight at 37°C, the number of 
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but 
detectable, number of colonies on the agar plates when grown in the presence of 
inducer as compared to when grown in the absence of inducer. Any ORF showing 
full bactericidal activity will show no colonies on the agar plates when grown in the 
presence of inducer, in contrast to growth in the absence of inducer. 
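
The colony-count criteria just described can be expressed as a simple decision rule. The sketch below is only an illustration of that rule: the function name is hypothetical, and no statistical threshold is applied because none is given in the text, so any reduction in colony number is labeled bacteriostatic.

```python
# Sketch only: labels an ORF from colony counts of induced and uninduced
# cultures plated on agar lacking inducer.

def classify_orf(colonies_induced: int, colonies_uninduced: int) -> str:
    """Return 'bactericidal', 'bacteriostatic' or 'no inhibition detected'."""
    if colonies_induced < 0 or colonies_uninduced <= 0:
        raise ValueError("uninduced control must yield colonies; counts must be non-negative")
    if colonies_induced == 0:
        return "bactericidal"            # no survivors after induction
    if colonies_induced < colonies_uninduced:
        return "bacteriostatic"          # fewer, but detectable, colonies
    return "no inhibition detected"


if __name__ == "__main__":
    for induced, uninduced in [(0, 250), (40, 250), (260, 250)]:   # toy counts
        print(induced, uninduced, classify_orf(induced, uninduced))
```

In practice replicate platings and a defined fold-reduction cutoff would be needed before drawing conclusions from small differences in colony number.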
Example 11. Growth of Enterococcus bacteriophage 182 and purification of 
genomic DNA. 

The Enterococcus propagating strain (PS) (Enterococcus sp. Group D, Felix 
d'Herelle Reference Centre #HER 1080) was used as host to propagate its respective 
phage 182 (Felix d'Herelle Reference Centre #HER 80). Two rounds of plaque 
purification of phage 182 were performed on soft agar essentially as described in 
Sambrook et al. (1989). Briefly, the Enterococcus sp. PS strain was grown overnight 
at 37°C in Tryptic Soy Broth [TSB: 17 g Bacto tryptone, 3 g Bacto soytone, 2.5 g 
Bacto dextrose, 5 g Sodium chloride, and 2.5 g Dipotassium phosphate per liter 
(Difco Laboratories #0370-17-3)]. The culture was then diluted 20 fold in TSB and 
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In 
order to obtain single plaques, phage 182 was subjected to 10 fold serial dilutions 
using the phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% Gelatin 
(w/v)) and 10 ul of each dilution was used to infect 0.5 ml of the bacterial cell 
suspension. After incubation for 15 min at 37°C, 2 ml of melted soft agar (TSB 
supplemented with 0.6% agar) was added to the mixture and poured onto the surface 
of 100 mm Tryptic Soy Agar plates [TSA: 15 g Tryptone peptone, 5 g Soytone 
peptone, 5 g Sodium chloride and 15 g of Agar per liter (Difco Laboratories #0369-17)]. 
After overnight incubation at 37°C, a single plaque was isolated, resuspended in 
1 ml of phage buffer by end over end rotation for 2 hrs at room temperature, and the 
phage suspension was diluted and used for a second infection as described above. 
After overnight incubation at 37°C, a single plaque was isolated and used as a stock 
for all subsequent manipulations. 

The propagation procedure for bacteriophage 1 82 was modified from the agar 
45 30 layer method of Swanstdrra and Adams (1951). Briefly, the Enterococcus sp. PS 

strain was grown to stationary phase overnight at 37°C in TSB. The culture was then. - 
diluted 20 fold in TSB and incubated at 37°C until the A* 0 = 0.2. The suspension 
(15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque forming units (pfu) to give a
ratio of 100 bacteria/pfu. After incubation for 15 min at 37°C, 7.5 ml of melted soft
agar (TSB plus 0.6% agar) were added to the mixture, which was poured onto the surface of
150 mm TSA plates and incubated for 16 hrs at 37°C. To collect the plate lysate, 20 ml
of TSB were added to each plate and the soft agar layer was scraped off
with a clean microscope slide, followed by vigorous shaking of the agar suspension for
5 min to break up the agar. The mixture was then centrifuged for 10 min at 4,000 rpm
(2,830 xg) using a JA-10 rotor (Beckman), and the supernatant fluid (lysate) was
collected and treated with 10 ug/ml of DNase I and RNase A for 30
min at 37°C. To precipitate the phage particles, the phage suspension was adjusted to
10% (w/v) PEG 8000 and 0.5 M NaCl, followed by incubation at 4°C for 16 hrs.
The phage was recovered by centrifugation at 4,000 rpm (3,500 xg) for 20 min at 4°C
on a GS-6R table top centrifuge (Beckman). The pellet was resuspended in 2 ml of
phage buffer (1 mM MgSO4, 5 mM MgCl2, 80 mM NaCl and 0.1% gelatin). The
phage suspension was extracted with 1 volume of chloroform and further purified by
centrifugation on a cesium chloride step gradient as described in Sambrook et al.
(1989), using a TLS-55 rotor in an Optima TLX ultracentrifuge
(Beckman) for 2 hrs at 28,000 rpm (67,000 xg) at 4°C. Banded phage was collected
and ultracentrifuged again on an isopycnic cesium chloride gradient (1.45 g/ml) at
40,000 rpm (64,000 xg) for 24 hrs at 4°C using a TLV rotor (Beckman). The phage
was harvested and dialyzed for 4 hrs at room temperature against 4 L of dialysis
buffer consisting of 10 mM NaCl, 50 mM Tris-HCl [pH 8] and 10 mM MgCl2. Phage
DNA was prepared from the phage suspension by adding 20 mM EDTA, 50 ug/ml
Proteinase K and 0.5% SDS and incubating for 1 hr at 65°C, followed by successive
extractions with 1 volume of phenol, 1 volume of phenol-chloroform and 1 volume of
chloroform. The DNA was then dialyzed overnight at 4°C against 4 L of TE (10 mM
Tris-HCl [pH 8.0], 1 mM EDTA).
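
The infection ratio quoted in the propagation procedure follows directly from the two counts (15 x 10^7 bacteria and 15 x 10^5 pfu). As a minimal sketch of that arithmetic, and not part of the original protocol, the function name and the Python rendering below are illustrative assumptions:

```python
def bacteria_per_pfu(n_bacteria: float, n_pfu: float) -> float:
    """Return the number of bacteria per plaque forming unit in an infection mix."""
    if n_pfu <= 0:
        raise ValueError("n_pfu must be positive")
    return n_bacteria / n_pfu


# Counts taken from the text: 15 x 10^7 bacteria mixed with 15 x 10^5 pfu.
ratio = bacteria_per_pfu(15e7, 15e5)
print(ratio)
```

The same helper can be used to work out how many pfu to add to a different number of cells while keeping the ratio of 100 bacteria/pfu.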

Example 12. DNA sequencing of the Bacteriophage 182 genome.

Four micrograms of phage DNA was diluted in 200 ul of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonicated
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an
amplitude of 3 um with bursts of 5 s spaced by 15 s of cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels using TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution in 50 ul of 1 mM Tris
[pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 ul)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 ug/ml BSA, 100 uM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for
15 min at room temperature. The reaction was stopped by two phenol/chloroform
extractions, the DNA was precipitated with ethanol, and the final DNA pellet was
resuspended in 20 ul of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation reaction
contained 100 ng of vector DNA and 2 to 5 ul of repaired sonicated phage DNA (50-100
ng) in a final volume of 20 ul containing 800 units of T4 DNA ligase (New England
Biolabs) and was incubated overnight at 16°C. Transformation and selection of
bacterial clones containing recombinant plasmids was performed in E. coli DH10B
according to standard procedures (Sambrook et al., 1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 ul LB and 100 ug/ml ampicillin and incubated at 37°C. The presence
of a phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 ul reaction volume containing 10 mM Tris (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 uM primer, 187.5 uM each dNTP,
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
the QIAprep™ spin miniprep kit (Qiagen).

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI prism Big Dye™ primer cycle sequencing (21M13 primer: #403055; M13REV
primer: #403056) or ABI prism Big Dye™ terminator cycle sequencing ready reaction
kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and
the genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing the ABI prism BigDye™ terminator cycle sequencing ready reaction
kit.
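
The coverage criterion stated above (every region sequenced from both directions on two separate clones) can be checked mechanically once read placements are known. The sketch below is only an illustration under stated assumptions: the `Read` record, its field names and the clone identifiers are hypothetical, and the criterion is interpreted as requiring, at each position, at least one forward read, at least one reverse read, and reads from at least two distinct clones.

```python
from typing import List, NamedTuple, Tuple


class Read(NamedTuple):
    clone: str   # hypothetical clone identifier
    start: int   # 0-based inclusive start on the genome
    end: int     # 0-based exclusive end on the genome
    strand: str  # "+" (forward) or "-" (reverse)


def uncovered_regions(genome_len: int, reads: List[Read]) -> List[Tuple[int, int]]:
    """Return [start, end) intervals that fail the assumed coverage criterion."""
    fwd = [False] * genome_len
    rev = [False] * genome_len
    clones = [set() for _ in range(genome_len)]
    for r in reads:
        for pos in range(max(0, r.start), min(genome_len, r.end)):
            if r.strand == "+":
                fwd[pos] = True
            else:
                rev[pos] = True
            clones[pos].add(r.clone)

    gaps, gap_start = [], None
    for pos in range(genome_len):
        ok = fwd[pos] and rev[pos] and len(clones[pos]) >= 2
        if not ok and gap_start is None:
            gap_start = pos          # a failing stretch begins here
        elif ok and gap_start is not None:
            gaps.append((gap_start, pos))
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, genome_len))
    return gaps
```

Intervals returned by `uncovered_regions` correspond to the areas for which, per the text, a sequencing primer would be selected and phage DNA used directly as template.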

Example 13. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge of
the contigs. Phage DNA was used directly as sequencing template employing the ABI
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Enterococcus bacteriophage 182 is shown in
Table 21.

A software program was used on the assembled sequence of bacteriophage 182
to identify all putative ORFs larger than 33 codons. The software scans the primary
nucleotide sequence starting at nucleotide #1 for an appropriate start codon. Three
possible selections can be made for defining the nature of the start codon: I) selection
of ATG, II) selection of ATG or GTG, and III) selection of either ATG, GTG, TTG,
CTG, ATT, ATC, or ATA. This latter initiation codon set corresponds to the one
reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code. When an
appropriate start codon is encountered, a counting mechanism is employed to count
the number of codons (groups of three nucleotides) between this start codon and the
next stop codon downstream of it. If a threshold value of 33 is reached or exceeded,
then the sequence encompassed by these two codons is defined as an ORF. This
procedure is repeated, each time starting at the next nucleotide following the previous
stop codon found, in order to identify all the other putative ORFs. The scan is
performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage 182 are listed in Tables 22 & 23.
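
The scanning procedure described above can be sketched in a few lines. This is not the software used in the original work, only a minimal illustration under stated assumptions: function and variable names are hypothetical, the codon count includes the start codon and excludes the stop codon, the stop codons are taken to be TAA, TAG and TGA, and an ORF that runs off the end of the sequence without a stop codon is not reported.

```python
START_SETS = {
    "I": {"ATG"},
    "II": {"ATG", "GTG"},
    "III": {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"},
}
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, start_set: str = "III", min_codons: int = 33):
    """Scan all three frames of both strands.

    Returns (strand, frame, start, end, n_codons) tuples. Coordinates are
    0-based offsets on the scanned strand; `end` is the offset of the stop
    codon, so seq[start:end] excludes the stop.
    """
    starts = START_SETS[start_set]
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None:
                    if codon in starts:
                        orf_start = i  # first start codon after the previous stop
                elif codon in STOP_CODONS:
                    n_codons = (i - orf_start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, orf_start, i, n_codons))
                    orf_start = None   # resume scanning after this stop codon
    return orfs
```

Calling `find_orfs(sequence, "I")`, `find_orfs(sequence, "II")` or `find_orfs(sequence, "III")` corresponds to the three start codon selections listed in the text.
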
Sequence homology searches for each ORF were carried out using an implementation 
of BLAST programs. Downloaded public databases used for sequence analysis 
include: 

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325 (ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa);

vi) Streptococcus pyogenes
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

vii) PRODOM
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz);

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/).

The results of the homology searches performed on the ORFs of bacteriophage 
182 are shown in Tables 24 & 26. 

Example 14. Sub-Cloning of bacteriophage 182 ORFs.
Preparation of the shuttle expression vector 

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage 182 ORF sequence is inducible.
For example, the plasmid pND50 replicates in E. coli, E. faecalis, and S. aureus
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163). This plasmid
can be modified by conventional techniques to insert the inducible arsenite promoter,
derived from the shuttle vector pT0021, in which firefly luciferase (lucFF)
expression is controlled by the ars promoter/operator from a S. aureus plasmid
(Tauriainen, S., Karp, M., Chang, W. and Virta, M. (1997). Recombinant luminescent
bacteria for measuring bioavailable arsenite and antimonite. Appl. Environ. Microbiol.
63:4456-4461). This modified shuttle vector will contain the ars promoter, arsR gene
and a cloning site for introduction of individual phage ORFs downstream from a
Shine-Dalgarno sequence.

Other inducible regulatory sequences can be utilized instead of the arsenite-
inducible system. An example is a nisin-inducible system. The nisA promoter activity
is dependent on the proteins NisR and NisK, which constitute a two-component signal
transduction system that responds to the extracellular inducer nisin. The nisin
sensitivity and inducer concentration required for maximal induction vary among
strains, but the system is functional in Streptococcus pyogenes, Streptococcus agalactiae,
Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis. Significant
induction of the nisA promoter (10- to 60-fold induction) can be obtained in all of these
species. A vector containing this promoter was published in Eichenbaum Z, Federle
MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998) Appl
Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, that allow
replication and transcription in Enterococcus can also be utilized.

Alternatively, a constitutive promoter can be used (e.g., the β-lactamase
promoter is constitutive in E. faecalis - see ref. 1) to drive expression of the
introduced ORF, and cell growth compared to that of control bacterial cells containing the
parental vector lacking any introduced phage ORF. Recombinant plasmids are
introduced into E. faecalis strain FA2-2 by electroporation, as previously described
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).

Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding the
stop codon) and is preceded by a different restriction site. The PCR product of each ORF
will be gel purified and digested with the restriction enzymes whose sites are contained in
the PCR oligonucleotides. The digested PCR product is then gel purified using the
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial
strain DH10B. Recombinant clones are then picked and their insert sizes confirmed by
PCR analysis using primers flanking the cloning site as well as by restriction digestion.
The sequence fidelity of cloned ORFs will be verified by DNA sequencing using the
same primers as used for PCR. In cases where verification of an ORF cannot be
achieved by one pass of sequencing using primers flanking the cloning site, internal
primers will be selected and used for sequencing. Recombinant plasmids will be
introduced into E. faecalis strain FA2-2 by electroporation, as previously described
(Yamagishi, J., Kojima, T., Oyamada, Y., Fujimoto, K., Hattori, H., Nakamura, S.,
and Inoue, M. 1996. Antimicrob. Agents Chemother. 40, 1157-1163).
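
The primer layout described above (restriction site followed by the first codons for the sense primer, and a different restriction site followed by the reverse complement of the last codons, stop excluded, for the antisense primer) can be illustrated as follows. This is a hedged sketch only: the function name, the annealing length of 20 nucleotides and the example sites (GGATCC for BamHI and GAATTC for EcoRI) are assumptions chosen for illustration and are not specified in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def design_orf_primers(orf_with_stop: str, site_5p: str, site_3p: str,
                       anneal_len: int = 20):
    """Build sense and antisense primers for an ORF given as the coding strand,
    from the initiation codon through the stop codon."""
    orf = orf_with_stop.upper()
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must be a whole number of codons ending in a stop codon")
    if site_5p.upper() == site_3p.upper():
        raise ValueError("the two restriction sites must differ")
    body = orf[:-3]  # the stop codon is excluded from the amplified region
    sense = site_5p.upper() + body[:anneal_len]
    antisense = site_3p.upper() + reverse_complement(body[-anneal_len:])
    return sense, antisense


# Toy ORF (hypothetical sequence) with a short annealing length for readability.
sense, antisense = design_orf_primers(
    "ATGAAACCCGGGTTTTAA", "GGATCC", "GAATTC", anneal_len=6
)
print(sense, antisense)
```

In practice a few extra bases are often added 5' of the restriction site to help the enzyme cut near the end of a PCR product; that refinement is omitted here because the text does not describe it.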
Induction of gene expression from the ars promoter. 

If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates

The functional identification of killer ORFs can be performed by spreading an aliquot
of E. faecalis transformed cells containing a phage 182 ORF onto agar plates containing
different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 uM). The plates are
incubated overnight at 37°C, after which growth inhibition of the ORF
transformants on plates that contain arsenite is compared to that on plates without arsenite.

2. Quantification of growth inhibition in liquid medium 

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are 
then diluted to the mid log phase (ODj«,=.2) with fresh media containing antibiotic 
and transferred to 96-well microtitration plates (100 ul/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 uM) and the culture incubated 
for an additional 2 h at 37°C. The effect of expression of the phage 182 ORFs on 
bacterial cell growth is then monitored by measuring the OD^ and comparing the rate 
of growth to the culture not containing inducer. As positive controls for growth
inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and
Blasi, U. 1993 Virology #193: 1033-1036), and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters #162:265-274) were
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.

Example 15. Growth of Streptococcus bacteriophage Dp-1 and purification of
genomic DNA.

The Streptococcus pneumoniae R6 propagating strain (PS) (Tomasz,
1966) was used as host to propagate its respective phage Dp-1 (McDonnell et al.,
1975). (Alternatively, Streptococcus (Diplococcus) pneumoniae R36A could be used.
Strain R36A is available from ATCC as #11733 or 27336. Streptococcus pneumoniae
is also available from the Felix d'Herelle Reference Center in Quebec, Canada as catalog
number HER 1054. Other S. pneumoniae strains are also available from ATCC.)

Two rounds of plaque purification of phage Dp-1 were performed on soft agar
essentially as described in Sambrook et al. (1989). Briefly, the Streptococcus R6 PS
strain was grown overnight at 37°C in K-CAT media [K-CAT: 10 g Bacto casitone, 5 g
Bacto tryptone, 1 g yeast extract, 5 g potassium chloride, 0.2% glucose, 30 mM
potassium phosphate buffer [pH 8] and 250,000 units catalase per liter (Boehringer
Mannheim #10683600)]. The culture was then diluted 20 fold in K-CAT and
incubated at 37°C until the OD540 = 0.2 (early log phase) with constant agitation. In
order to obtain single plaques, Dp-1 phage was subjected to 10-fold serial dilutions
using the phage buffer (100 mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM
MgCl2) and 10 ul of each dilution was used to infect 0.5 ml of the cell suspension.
After incubation for 15 min at 37°C, 2 ml of melted soft agar (K-CAT supplemented
with 0.8% agar) were added to the mixture, which was poured onto the surface of 100 mm
K-CAT agar plates [K-CAT supplemented with 1.2% agar]. After solidification of
the soft agar layer, an additional 5 ml of melted soft agar was added to visualize
distinct plaques (Ronda et al., 1978). After overnight incubation at 37°C, a single
plaque was isolated, resuspended in 1 ml of phage buffer by end-over-end rotation for
2 hrs at room temperature, and the phage suspension was diluted and used for a
second infection as described above. After overnight incubation at 37°C, a single
plaque was isolated and used as a stock for all subsequent manipulations.
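
Because 10 ul of each 10-fold serial dilution is plated, a plaque count on any countable plate can be converted back to the titer of the undiluted phage suspension. The sketch below illustrates that conversion; the function name and the plaque count used in the example are hypothetical, and only the 10-fold dilution series and the 10 ul plated volume come from the text.

```python
def titer_pfu_per_ml(plaque_count: int, dilution_step: int,
                     plated_volume_ul: float = 10, fold: int = 10) -> float:
    """Estimate the titer (pfu/ml) of the undiluted phage suspension.

    dilution_step is the number of serial `fold`-dilutions applied before
    plating (0 means the undiluted suspension was plated).
    """
    if plated_volume_ul <= 0:
        raise ValueError("plated_volume_ul must be positive")
    dilution_factor = fold ** dilution_step
    # 1000 ul per ml converts the plated volume to a per-ml figure.
    return plaque_count * dilution_factor * 1000 / plated_volume_ul


# Hypothetical example: 42 plaques from 10 ul of the sixth 10-fold dilution.
print(titer_pfu_per_ml(42, 6))
```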

The propagation procedure for bacteriophage Dp-1 was modified from the
agar layer method of Swanstrom and Adams (1951). Briefly, the R6 strain of
Streptococcus pneumoniae was grown to stationary phase overnight at 37°C in K-CAT.
The culture was then diluted 20 fold in K-CAT and incubated at 37°C until the
OD540 = 0.2. The suspension (15 x 10^7 bacteria) was then mixed with 15 x 10^5 plaque
forming units (pfu) to give a ratio of 100 bacteria/pfu. After incubation for 15 min at
37°C, 7.5 ml of melted soft agar (K-CAT plus 0.8% agar) were added to the mixture,
which was poured onto the surface of 150 mm K-CAT agar plates and incubated for 16 hrs
at 37°C. After solidification of the soft agar layer, 7.5 ml of melted soft agar were added
to each plate. To collect the plate lysate, 20 ml of K-CAT media were added to each
plate and the soft agar layers were scraped off with a clean microscope
slide, followed by vigorous shaking of the agar suspension for 5 min to break up the
agar. The mixture was then centrifuged for 10 min at 4,000 rpm (2,830 xg) using a
JA-10 rotor (Beckman) and the supernatant (lysate) was collected and
treated with 10 ug/ml of DNase I and RNase A for 30 min at 37°C. To precipitate
the phage particles, the phage suspension was adjusted to 10% (w/v) PEG 8000 and
0.5 M NaCl, followed by incubation at 4°C for 16 hrs. The phage was recovered by
centrifugation at 4,000 rpm (3,500 xg) for 20 min at 4°C on a GS-6R table top
centrifuge (Beckman). The pellet was resuspended in 2 ml of phage buffer (100
mM Tris-HCl [pH 7.5], 100 mM NaCl and 10 mM MgCl2). The phage suspension
was extracted with 1 volume of chloroform and further purified by centrifugation on a
cesium chloride step gradient as described in Sambrook et al. (1989), using a TLS-55
rotor in an Optima TLX ultracentrifuge (Beckman) for 2 hrs at 28,000
rpm (67,000 xg) at 4°C. Banded phage was collected and ultracentrifuged again on an
isopycnic cesium chloride gradient (1.45 g/ml) at 40,000 rpm (64,000 xg) for 24 hrs at
4°C using a TLV rotor (Beckman). The phage was harvested and dialyzed for 4 hrs at
room temperature against 4 L of dialysis buffer consisting of 10 mM NaCl, 50 mM
Tris-HCl [pH 8] and 10 mM MgCl2. Phage DNA was prepared from the phage
suspension by adding 20 mM EDTA, 50 ug/ml Proteinase K and 0.5% SDS and
incubating for 1 hr at 65°C, followed by successive extractions with 1 volume of
phenol, 1 volume of phenol-chloroform and 1 volume of chloroform. The DNA was
then dialyzed overnight at 4°C against 4 L of TE (10 mM Tris-HCl [pH 8.0], 1 mM
EDTA).






Example 16. DNA sequencing of the Bacteriophage Dp-1 genome.



Four micrograms of phage DNA was diluted in 200 ul of TE (10 mM Tris
[pH 8.0], 1 mM EDTA) in a 1.5 ml Eppendorf tube and sonicated
(550 Sonic Dismembrator, Fisher Scientific). Samples were sonicated at an
amplitude of 3 um with bursts of 5 sec spaced by 15 sec of cooling in ice/water for 3 to 4
cycles. The sonicated DNA was then size fractionated by electrophoresis on 1%
agarose gels using TAE (1 x TAE is: 40 mM Tris-acetate, 1 mM EDTA [pH 8.0])
as the running buffer. Fractions ranging from 1 to 2 kbp were excised from the
agarose gel and purified using a commercial DNA extraction system according to the
instructions of the manufacturer (Qiagen), with a final elution in 50 ul of 1 mM Tris
[pH 8.5].

The ends of the sonicated DNA fragments were repaired with a combination of
T4 DNA polymerase and the Klenow fragment of E. coli DNA polymerase I, as
follows. Reactions were performed in a reaction mixture (final volume, 100 ul)
containing sonicated phage DNA, 10 mM Tris-HCl [pH 8.0], 50 mM NaCl, 10 mM
MgCl2, 1 mM DTT, 50 ug/ml BSA, 100 uM of each dNTP and 15 units of T4 DNA
polymerase (New England Biolabs) for 20 min at 12°C, followed by addition of 12.5
units of the Klenow large fragment of DNA polymerase I (New England Biolabs) for
15 min at room temperature. The reaction was stopped by two phenol/chloroform
extractions, the DNA was precipitated with ethanol, and the final DNA pellet was
resuspended in 20 ul of H2O.

Blunt-ended DNA fragments were cloned by ligation directly into the Hinc II
site of the pKSII+ vector (New England Biolabs) dephosphorylated by treatment with
calf intestinal alkaline phosphatase (New England Biolabs). A typical ligation
reaction contained 100 ng of vector DNA and 2 to 5 ul of repaired sonicated phage DNA
(50-100 ng) in a final volume of 20 ul containing 800 units of T4 DNA ligase (New
England Biolabs) and was incubated overnight at 16°C. Transformation and selection
of bacterial clones containing recombinant plasmids was performed in E. coli DH10B
according to standard procedures (Sambrook et al., 1989).

Recombinant clones were picked from agar plates into 96-well plates
containing 100 ul LB and 100 ug/ml ampicillin and incubated at 37°C. The presence
of a phage DNA insert was confirmed by PCR amplification using T3 and T7 primers
flanking the Hinc II cloning site of the pKS vector. PCR amplification of the potential
foreign inserts was performed in a 15 ul reaction volume containing 10 mM Tris (pH
8.3), 50 mM KCl, 1.5 mM MgCl2, 0.02% gelatin, 1 uM primer, 187.5 uM each dNTP,
and 0.75 units Taq polymerase (BRL). The thermocycling parameters were as
follows: initial denaturation at 94°C for 2 min, followed by 20 cycles of 30 sec
denaturation at 94°C, 30 sec annealing at 58°C, and 2 min extension at 72°C,
followed by a single extension step at 72°C for 10 min. Clones with insert sizes of 1
to 2 kbp were selected and plasmid DNA was prepared from the selected clones using
the QIAprep™ spin miniprep kit (Qiagen).
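
For planning purposes, the thermocycling parameters given above can be written down as data and the total programmed hold time computed. This is a small illustrative sketch; the variable and function names are assumptions, and ramp times between temperatures, which depend on the instrument, are not included.

```python
# (temperature in degrees C, hold time in seconds), as stated in the text
INITIAL_DENATURATION = (94, 120)
CYCLE_STEPS = [(94, 30), (58, 30), (72, 120)]  # denature, anneal, extend
N_CYCLES = 20
FINAL_EXTENSION = (72, 600)


def program_hold_seconds(initial, cycle_steps, n_cycles, final) -> int:
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(seconds for _, seconds in cycle_steps)
    return initial[1] + n_cycles * per_cycle + final[1]


total = program_hold_seconds(INITIAL_DENATURATION, CYCLE_STEPS,
                             N_CYCLES, FINAL_EXTENSION)
print(total, "seconds =", total / 60, "minutes")
```
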

The nucleotide sequence of the extremities of each recombinant clone was
determined using an ABI 377-36 automated sequencer with two types of chemistry:
ABI prism Big Dye™ primer cycle sequencing (21M13 primer: #403055; M13REV
primer: #403056) or ABI prism Big Dye™ terminator cycle sequencing ready reaction
kit (Applied Biosystems; #4303152). To ensure co-linearity of the sequence data and
the genome, all regions of the phage genome were sequenced at least once from both
directions on two separate clones. In areas where this criterion was not initially met, a
sequencing primer was selected and phage DNA was used directly as sequencing
template employing the ABI prism Big Dye™ terminator cycle sequencing ready reaction
kit.

Example 17. Bioinformatic management of primary nucleotide sequence.

Sequence contigs were assembled using Sequencher™ 3.1 software
(GeneCodes). To close contig gaps, sequencing primers were selected near the edge
of the contigs. Phage DNA was used directly as sequencing template employing the ABI
prism BigDye™ terminator cycle sequencing ready reaction kit (Applied Biosystems;
#4303152). The complete sequence of Streptococcus bacteriophage Dp-1 is shown in
Table 28.

A software program was used on the assembled sequence of bacteriophage
Dp-1 to identify all putative ORFs larger than 33 codons. The software scans the
primary nucleotide sequence starting at nucleotide #1 for an appropriate start codon.
Three possible selections can be made for defining the nature of the start codon: I)
selection of ATG, II) selection of ATG or GTG, and III) selection of either ATG,
GTG, TTG, CTG, ATT, ATC, or ATA. This latter initiation codon set corresponds
to the one reported by the NCBI
(http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c) for the
bacterial genetic code. When an
appropriate start codon is encountered, a counting mechanism is employed to count
the number of codons (groups of three nucleotides) between this start codon and the
next stop codon downstream of it. If a threshold value of 33 is reached or exceeded,
then the sequence encompassed by these two codons is defined as an ORF. This
procedure is repeated, each time starting at the next nucleotide following the previous
stop codon found, in order to identify all the other putative ORFs. The scan is
performed on all three reading frames of both DNA strands of the phage sequence.
The predicted ORFs for bacteriophage Dp-1 are listed in Tables 29 and 30, and Fig. 6.

Sequence homology searches for each ORF were carried out using an 
implementation of BLAST programs. Downloaded public databases used for 
sequence analysis include: 

i) non-redundant GenBank (ftp://ncbi.nlm.nih.gov/blast/db/nr.Z);

ii) Swissprot (ftp://ncbi.nlm.nih.gov/blast/db/swissprot.Z);

iii) vector (ftp://ncbi.nlm.nih.gov/blast/db/vector.Z);

iv) pdbaa databases (ftp://ncbi.nlm.nih.gov/blast/db/pdbaa.Z);

v) Staphylococcus aureus NCTC 8325
(ftp://ftp.genome.ou.edu/pub/staph/staph-1k.fa);

vi) Streptococcus pyogenes
(ftp://ftp.tigr.org/pub/data/s_pneumoniae/gsp.contigs.112197.Z);

vii) PRODOM
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom99.1.forblast.gz);

viii) DOMO (ftp://ftp.infobiogen.fr/pub/db/domo/);

ix) TREMBL (ftp://www.expasy.ch/databases/sp_tr_nrdb/fasta/).

The results of the homology searches performed on the ORFs of bacteriophage
Dp-1 are shown in Table 31.

Example 18. Sub-Cloning of Bacteriophage Dp-1 ORFs.
Preparation of the shuttle expression vector 

Expression preferably utilizes a shuttle expression vector which is arranged
such that expression of the exogenous bacteriophage Dp-1 ORF sequence is inducible.
For example, the plasmid pLSE4 replicates in E. coli and S. pneumoniae (Diaz and
Garcia, 1990). This plasmid can be modified by conventional techniques to insert the
inducible arsenite promoter, derived from the shuttle vector pT0021, in which
firefly luciferase (lucFF) expression is controlled by the ars promoter/operator from a
S. aureus plasmid (Tauriainen, S., Karp, M., Chang, W. and Virta, M. (1997).
Recombinant luminescent bacteria for measuring bioavailable arsenite and antimonite.
Appl. Environ. Microbiol. 63:4456-4461). This modified shuttle vector will contain
the ars promoter, arsR gene and a cloning site for introduction of individual phage
ORFs downstream from a Shine-Dalgarno sequence.

Other inducible regulatory sequences can be utilized instead of the arsenite-
inducible system. An example is a nisin-inducible system. The nisA promoter activity
is dependent on the proteins NisR and NisK, which constitute a two-component signal
transduction system that responds to the extracellular inducer nisin. The nisin
sensitivity and inducer concentration required for maximal induction vary among
strains, but the system is functional in Streptococcus pyogenes, Streptococcus agalactiae,
Streptococcus pneumoniae, Enterococcus faecalis, and Bacillus subtilis. Significant
induction of the nisA promoter (10- to 60-fold induction) can be obtained in all of these
species. A vector containing this promoter was published in Eichenbaum Z, Federle
MJ, Marra D, de Vos WM, Kuipers OP, Kleerebezem M, and Scott JR (1998) Appl
Environ Microbiol 64, 2763-2769. Other vectors, e.g., plasmids, that allow
replication and transcription in Streptococcus can also be utilized.

Alternatively, a constitutive promoter can be used to drive expression
of the introduced ORF, and cell growth compared to that of control bacterial cells containing
the parental vector lacking any introduced phage ORF. Recombinant plasmids are
introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990).

Cloning of ORFs with a Shine-Dalgarno sequence 

ORFs with a Shine-Dalgarno sequence are selected for functional analysis of
bacterial killing. Each ORF, from initiation codon to last codon (excluding the stop
codon), will be amplified by PCR from phage genomic DNA. For PCR amplification
of ORFs, each sense strand primer starts at the initiation codon and is preceded by a
restriction site, and each antisense strand primer starts at the last codon (excluding the
stop codon) and is preceded by a different restriction site. The PCR product of each ORF
will be gel purified and digested with the restriction enzymes whose sites are contained in
the PCR oligonucleotides. The digested PCR product is then gel purified using the
Qiagen kit, ligated into the modified shuttle vector, and used to transform bacterial
strain DH10B. Recombinant clones are then picked and their insert sizes confirmed
by PCR analysis using primers flanking the cloning site as well as by restriction
digestion. The sequence fidelity of cloned ORFs will be verified by DNA sequencing
using the same primers as used for PCR. In cases where verification of an ORF
cannot be achieved by one pass of sequencing using primers flanking the cloning site,
internal primers will be selected and used for sequencing. Recombinant plasmids will
be introduced into S. pneumoniae R6 as previously described (Diaz and Garcia, 1990).

Induction of gene expression from the ars promoter. 
If an inducible promoter is used, e.g., the ars promoter, induction can be
assessed, for example, by either of the following two methods.

1. Screening on agar plates 

The functional identification of killer ORFs can be performed by spreading an
aliquot of S. pneumoniae transformed cells containing phage Dp-1 ORFs onto agar
plates containing different concentrations of sodium arsenite (0; 2.5; 5; and 7.5 uM).
The plates are incubated overnight at 37°C, after which growth inhibition of the
ORF transformants on plates that contain arsenite is compared to that on plates without
arsenite.

2. Quantification of growth inhibition in liquid medium

Cells containing different recombinant plasmids can be grown overnight at
37°C in LB medium supplemented with the appropriate antibiotic selection. These are
then diluted to the mid log phase (OD540 = 0.2) with fresh media containing antibiotic
and transferred to 96-well microtitration plates (100 µl/well). Inducer is then added at
different final concentrations (ranging from 2.5 to 10 µM) and the culture incubated
for an additional 2 hrs at 37°C. The effect of expression of the phage Dp-1 ORFs on
bacterial cell growth is then monitored by measuring the OD540 and comparing the rate
of growth to the culture not containing inducer. As positive controls for growth
inhibition, the kilA gene of phage lambda (Reisinger, GR., Rietsch, A., Lubitz, W. and
Blasi, U. 1993. Virology 193:1033-1036), and the holin/lysin genes of the
Staphylococcus aureus phage Twort (Loessner, MJ., Gaeng, S., Wendlinger, G.,
Maier, SK. and Scherer, S. 1998. FEMS Microbiology Letters 162:265-274) can be
subcloned into the ars inducible vector. An aliquot of the induced and uninduced
culture can also be plated out on agar plates containing an appropriate antibiotic
selection but lacking inducer. Following incubation overnight at 37°C, the number of
colonies is counted. Any ORF showing bacteriostatic activity will show a lower, but
detectable, number of colonies on the agar plates when grown in the presence of
inducer as compared to when grown in the absence of inducer. Any ORF showing
full bactericidal activity will show no colonies on the agar plates when grown in the
presence of inducer as compared to when grown in the absence of inducer.
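The comparison described above can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the names WellReading, growth_rate, percent_inhibition and classify_by_colonies are hypothetical, the growth rate is simplified to the average change in optical density per hour over the induction period, and the colony-count classification simply restates the criteria given in the text (no colonies, fewer colonies, or no reduction) without any statistical threshold.

```python
from dataclasses import dataclass


@dataclass
class WellReading:
    """Optical density of one microtitration well (hypothetical record type)."""
    od_start: float  # OD at the time inducer is added
    od_end: float    # OD after the additional incubation


def growth_rate(reading: WellReading, hours: float = 2.0) -> float:
    """Average OD increase per hour over the induction period (simplification)."""
    return (reading.od_end - reading.od_start) / hours


def percent_inhibition(induced: WellReading, uninduced: WellReading,
                       hours: float = 2.0) -> float:
    """Growth inhibition of the induced culture relative to the uninduced one."""
    control = growth_rate(uninduced, hours)
    if control <= 0:
        raise ValueError("uninduced control culture did not grow")
    return 100.0 * (1.0 - growth_rate(induced, hours) / control)


def classify_by_colonies(induced_colonies: int, uninduced_colonies: int) -> str:
    """Apply the colony-count criteria stated in the text (no statistics)."""
    if uninduced_colonies <= 0:
        raise ValueError("uninduced control plate shows no colonies")
    if induced_colonies == 0:
        return "bactericidal"
    if induced_colonies < uninduced_colonies:
        return "bacteriostatic"
    return "no inhibition"


if __name__ == "__main__":
    # Illustrative numbers only, not experimental data.
    induced = WellReading(od_start=0.2, od_end=0.3)
    control = WellReading(od_start=0.2, od_end=0.6)
    print(round(percent_inhibition(induced, control), 1))
    print(classify_by_colonies(induced_colonies=12, uninduced_colonies=300))
```

In practice replicate wells and plates would be needed before calling a difference in colony number meaningful; the sketch only shows how the two read-outs relate.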
All patents and publications mentioned in the specification are indicative of
the levels of skill of those skilled in the art to which the invention pertains. All
references cited in this disclosure are incorporated by reference to the same extent as
if each reference had been incorporated by reference in its entirety individually.

One skilled in the art would readily appreciate that the present invention is
well adapted to carry out the objects and obtain the ends and advantages mentioned,
as well as those inherent therein. The specific methods and compositions described
herein as presently representative of preferred embodiments are exemplary and are not
intended as limitations on the scope of the invention. Changes therein and other uses
will occur to those skilled in the art; these are encompassed within the spirit of the
invention and are defined by the scope of the claims.

It will be readily apparent to one skilled in the art that varying substitutions
and modifications may be made to the invention disclosed herein without departing
from the scope and spirit of the invention. For example, those skilled in the art will
recognize that the invention may suitably be practiced using a variety of different
bacteria, bacteriophage, and sequencing methods within the general descriptions
provided.

The invention illustratively described herein suitably may be practiced in the
absence of any element or elements, limitation or limitations which is not specifically
disclosed herein. Thus, for example, in each instance herein any of the terms
"comprising," "consisting essentially of" and "consisting of" may be replaced with
either of the other two terms. The terms and expressions which have been employed
are used as terms of description and not of limitation, and there is no intention, in
the use of such terms and expressions, of excluding any equivalents of the features
shown and described or portions thereof, but it is recognized that various
modifications are possible within the scope of the invention claimed. Thus, it should
be understood that although the present invention has been specifically disclosed by
preferred embodiments and optional features, modification and variation of the
concepts herein disclosed may be resorted to by those skilled in the art, and that such
modifications and variations are considered to be within the scope of this invention as
defined by the appended claims.

In addition, where features or aspects of the invention are described in terms of
Markush groups or other grouping of alternatives, those skilled in the art will 
recognize that the invention is also thereby described in terms of any individual 
member or subgroup of members of the Markush group or other group. For example, 
if there are alternatives A, B, and C, all of the following possibilities are included: A 
separately, B separately, C separately, A and B, A and C, B and C, and A and B and 
C. Thus, for example, for the bacteria and phage specified herein, the embodiments 
expressly include any subset or subgroup of those bacteria and/or phage. While each 
such subset or subgroup could be listed separately, for the sake of brevity, such a 
listing is replaced by the present description. 
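The A, B, C example above amounts to listing every non-empty subset of the group. A minimal Python sketch of that enumeration follows; the function name markush_subgroups is hypothetical and is used only for illustration.

```python
from itertools import combinations


def markush_subgroups(members):
    """Return every non-empty subset of the group, smallest subsets first."""
    members = list(members)
    return [
        combo
        for size in range(1, len(members) + 1)
        for combo in combinations(members, size)
    ]


if __name__ == "__main__":
    for subgroup in markush_subgroups(["A", "B", "C"]):
        print(" and ".join(subgroup))
```

A group of n alternatives yields 2^n - 1 such subsets, which is why the description replaces the explicit listing for brevity.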

Thus, additional embodiments are within the scope of the invention and within 
the following claims. 
Table 1

Phages against human and animal pathogenic bacteria


Pathogen name


Phage name


Catalogue #


Origin/reference




Acinetobacter
calcoaceticus


A3/2 

A 10/45 

A36 

B9GP 

B9PP 

BS46 

E13 




Felix d'Herelle Reference
Centre, Quebec, Quebec


25 




E14 
531 










P78 




j. oacicnoi 170**. id/. 1 /y-ioj 

J. Gen. Microbiol 1986.132: 2633-2636 




Acinetobacter
haemolyticus






Felix d'Herelle Reference 
Centre, Quebec, Quebec


30 










35 


Acinetobacter
johnsonii 






Felix d'Herelle Reference 
Centre, Quebec, Quebec


Acinetobacter sp.


BP1 




J.Virol.1968.2:716-722






G4, HP2, HP3 &
HP4




Can.J.Microbiol. 1966. 12:1023-1030 &

J.Virol.1974.13:46-52 &

Arch.Virol.1994.135:345-354


40 




A1.A4.A9& 
196 




Arch. Virol. 1994. 135:345-354






HP1 




Can.J.Microbiol.1966.12:1023-1030


45 




A 19, A23, A29, 
A31,A33,A34, 
A3759&2845 




J.Microsc. (Paris) 1973.16:215-224 &
C.R.Hebd.Seances Acad.Sci.Ser.D.Sci.
Natur.(Paris) 278:1907-1909 &
Arch.Virol.1994.135:345-354 &
Rev.Can.Biol. 1970.29:317-320




Actinobacillus
actinomycetemcomitans






FEMS Microbiol. Lett. 1994. 119:329-337



Infect. Immun. 1982. 35: 343-349








Mol.Gen.Genet. 1998.258: 323-325




A*p247 




Oral Microbiol. Immunol. 1997.12: 40-46


Actinomyces viscosus 




43146-B1 


The American Type Culture Collection 








Infect.Immun.1985.48:228-233








f nfjlj 1 lull HULL IOOC CA.C/I CO 

unecuinunun. 1900.30:3409 








Plasmid 1997.37:141-153


Aeromonas hydrophila 


PM2** &PM3 




FEMS Microbiol.Lett. 1990.57:277-282 ! 




Aehl 




Felix d'Herelle Reference 




Aeh2 




Centre, Quebec, Quebec




PM4 








PM5 








PM6 








T7-ah 


Aeromonas 
salmonicida 


3 

25 
29 
31 
32 

40RRz.it 

43 

51 

56 

59.1 

65 

Asd37 




Felix d'Herelle Reference
Centre, Quebec, Quebec






55R.I 




Can. J. Microbiol. 1983. 29: 1458-1461




Alteromonas espejiana 


PM2** 


27025-B1 


The American Type Culture Collection 


20 


Asticcacaulis
biprosthecum 






Felix d'Herelle Reference 
Centre, Quebec, Quebec




Asticcacaulis
excentricus 




15261-B1 
15261-B2 
15261-B3 


The American Type Culture Collection 


25 




$Ac21 
6Ac24 






30 


Azotobacter vinelandii 


A21 
A3I 
A41 
PA VI 


12518-B1

12518-B4 

12518-B5 

12518-B9 

12518-B10 

13705-B1 


The American Type Culture Collection 


35 


Azotobacter sp. 






Virology 1972.49:439-452 




Bacteroides fragilis


Bf-1 




Rev. Infect. Dis. 1979. 1: 325-336






B40-8 




FEMS Microbiol. Lett 1991. 66: 61-67 






HSP40 




Appl. Environ. Microbiol 1989. 55: 2696- 
2701 


40 




phiAl 




Zentralbl.Bakteriol. 1972.222:57-63




Bdellovibrio 
bacteriovorus 


MAC-1 




J. Gen. Microbiol. 1987. 133: 3065-3070




Bdellovibrio sp. 


VL-1 




J.Virol.1973.12:1522-1533




Bordetella
bronchiseptica


214 




Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9-13





WO 00/32825 




115 


PCT/IB99/02040 




Bordetella 
parapertussis 






Felix d'Herelle Reference
Centre, Quebec, Quebec


10 










15 








Mol. Gen. Mikrobiol. Virusol. 1988.4: 22-25 










Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9-13


20 








Zh.Mikrobiol.Epidemiol.Immuno. 1987.5:9-13




Brucella abortus 






Felix d'Herelle Reference
Centre, Quebec, Quebec


25 
30 
35 












10/1 
24/11 
212/XV 


23448-B1 
23448-B2 
23448-B3 
17385-B1 
17385-B2


The American Type Culture Collection 


40 










45 




BK-2.TB & 
Fi** 




Zh.Mikrobiol.Epidemiol.Immunobiol. 1983.2:48-52






R/c&R/O 




Dev. Biol. Stand. 1984.56: 55-62




Brucella canis 


R/c 




Dev. Biol. Stand. 1984.56: 55-62 




Brucella melitensis


BK-2 


23456-B1 


The American Type Culture Collection 


50 


Brucella suis 


Wb 




Zentralbl.Veterinarmed. 1975.22:866-867








Fi** &TB 




Zh.Mikrobiol.Epidemiol.Immunobiol. 1983.2:48-52


Brucella sp.






Can. J. Vet. Res. 1989.53: 319-325








Res. Vet Sci. 1988.44:45-49 




R 












48 


Campylobacter coli




43133-B1 


The American Type Culture Collection 






43134-B1 




Campylobacter coli


18 


43135-B1 


The American Type Culture Collection 


(Cont'd) 


19 


43136-B1






20 






Campylobacter jejuni 


1 


35918-B1


The American Type Culture Collection 


2 


35919-B1 






3 


35920-B1 






4 


35921-B1 






5 


35918-B2 






6 


35920-B2 






7 


35922-B2 






8 


35923-B1 






9 


35924-B1 






10 


35925-B1 






11 


35925-B2 






12 


35922-B2 






13 


35924-B2 






14 


35922-B3 






17 


43133-B1 






18 


43134-B1 






19 


43135-B1 






20 


43136-B1




Campylobacter 


HP1 




J. Med. Microbiol. 1993. 38: 245-249


(Helicobacter) pylori 








Chlamydia psittaci 


Choi** 




J. Gen. Virol. 1989. 70: 3381-3390 


Clostridium 


CAK-1 




J.Bacteriol.1993.175:3838-3843


acetobutylicum












Clostridium botulinum






Nucleic Acids Res. 1990. 18:1291








Biochem.Biophys.Res.Commun. 1990.171:1304-1311








Microbiol.Immunol. 1981.25:915-927








T \f»» KAaA c<%: torn C4.£*7< AOA 




CE 3 & CE y 






Clostridium difficile


41&56 




J. Clini.Microbiol. 1985.21:251-254 






Clostridium 






Rev.Can.Biol. 1977.36:205-215


perfringens 














rbMb Microbiol. Lett 19yU.34:323-J2o 


Clostridium 




8074-B1 


The American Type Culture Collection 


sporogenes 


59 


17886-B1






70 


17886-B3 






71 


17886-B4 






72S 


17886-B5






72L 


17886-B6




Clostridium tetani 


A&B 




Rev.Can.Biol.1978.37:43-46


Corynebacterium 






Vopr.Virusol.1986.31:577-584


diphtheriae








Corynebacterium 


NN 


12319-B1 


The American Type Culture Collection 


pseudotuberculosis 








Corynebacterium sp 


DLC 2921/49 


12052-B1 


The American Type Culture Collection 






Enterococcus faecalis 


42 


19948-B1 


The American Type Culture Collection 


Enterococcus faecium 


124 

133 


19950-B1 
19953-B2
19953-B1 


The American Type Culture Collection 








Escherichia coli 




11303-B14 
11303-B10
11303-B21


The American Type Culture Collection 


10 






8677-B1 








11303-B13
13706-B4






Escherichia coli 




15766-B1 


The American Type Culture Collection 




(Cont'd) 




15766-B1 
1242-B5 


15 






15669-B2 








15767-B1

11303-B16 

27-65-B1 








C204 
El 

n** 


25065-B2 
15669-B1 




20 




15597-B1 








a** 

FCZ 
fd** 


21816-B1 








23724-B9 
15593-B1 








25404-B1
29746-B1 




25 






23631-B1 
25868-B1 
25298-B1 
25298-B2 
11303-B37




30 






11303-B24






Ifl" 


11303-B26 








11303-B27
11303-B28
11303-B29
11303-B30




JO 






11303-B33 
11303-B31 
11303-B25 
11303-B35








MS2** 

MU9 

Mu-1 

0x6 

PI** 

P4 sidi 

R17** 

Z1K/1 
ZJ/2 


11303-B34




40 




11303-B36
11303-B32
13706-B5
11303-B1








11303-B2 
11303-B3 




45 




11303-B4
35060-B1








35060-B2 








35060-B3 
11303-B5
11303-B6 




50 






11303-B7 








11303-B38 
1214UB1 










Escherichia coli




11303-B20 


The American Type Culture Collection 




(Cont'd) 




11303-B17 
11303-B15


m 
15 




UVl 

UV47 

UV375 

GU 
K 

AH/ 
A. SUS r-3 
X sus R-5 
Xws J-6 
X sua 0-8 


11303-Bt] 

11303-B18

13706-B2 

23724-B2 

23724-B1

23724-B3 

23724-B4 

23724-B5 

23724-B6 

23724-B7 

23724-B8 

35860-Bl 




20 




XsosA-U 
X ind' 


13706-B3 
15597-B2 
13706-B1 
49696-B1 




25 




0C174** . 

$Xcs70am-3 










G4** & 4C** 




Biochim.Biophys.Acta 1992.1130:277-288






BF23** 




J.Bacteriol . 1977. 129:265-275 






Mul 




J.Ultrastruct.Res. 1966.14:441-448






Hpl7 




J.Mol.Biol. 1991.218:705-721


30 




K3** &Ox2** 




FEBS Lett. 1987.215:145-150






Rbi8**,Rb51 & 




J.Bacteriol. 1990.172:180-186






Rb69**' 










HI**, H3, H8, 




Mol.Gen.Genet. 1990.221:491-494






K9, 






35 




K18 & Oxl 








Ml**,TuIa**& 
Tulb** 




J.Mol.Biol.1987.196:165-174






K10 




J.Bacteriol.1979.140:680-686






Osr* 




J.Bacteriol. 1985.162:256-262






B278 




J.Gen.Microbiol. 1988.134:1333-1338


40 




phi 80** 




FEMS Microbiol.Lett. 1994.119:71-76




phiml73 




Genetika 1985.21:673-675 






tf-1 




J.Gen.Microbiol.1987.133:953-960






P4&phiR73 




Mol.Microbiol. 1995.18:201-208






M 




J.Gen.Microbiol.1982.128:2797-2804






PRD1 




Virology 1990.177:445-451 


45 




K3hx 




Mol.Gen.Genet. 1987.206:110-115






933J**& 
933W*« 




Infect.Immunity.1986.53:135-140






H19-B** 




J.Bacteriol. 1987.169:4308-4312


50 




Tcp-Ul 




Zentralbl.Bakteriol.Mikrobiol.Hyg.1988.270:41-51






Escherichia coli 
(Cont'd) 



N4* 



Phi 80 trp 



Obcta 1 



P1CM 



PA-2* 



186** 



186.IX.B 



21* 



P4* 



82* 



PSP3 



HK022* 



D108* 



RM9 



Ike* 



P22dis 



Nil* 



in* 



Stx2Phi-I & 
Stx2Phi-II 



18 



AC3 



Vet.Microbiol. 1992.30:203-212



Ann.Inst.Pasteur. 1971.120:121-125



J.Bacteriol.1978.133:172-177



J.Gen.Microbiol. 1978.107:73-83



J.Bacteriol.1990.172:166(>.1662



Mol.Gen.Genet. 1982.187:87-95



Mol.Microbiol. 1992.6:2629-2642



Virology 1983.129:484-489



Microbiol.Rev. 1993.57:683-702



J.Biol.Chem.1987.262:11721-11725



J.Bacteriol. 1996.178:5668-5675



Nucleic Acids Res. 1994.22:354-356



Nucleic Acids Res. 1986.14:3813-3825



J.Mol.Biol. 1997.267:237-249



J.Mol.Biol.1985.181:27-39



Mol.Gen.Genet.1978.166:233-243



J.Bacteriol.1996.178:1484-1486



Proc.R.Soc.Lond.B.Biol.Sci. 1991.245:23-30



Infect.Immun. 1998.66:4100-4107



Virology 1987.156:122-126



J.Gen.Microbiol.1981.126:389-396



Mol.Microbiol. 1991.5:715-725








BW-1 




Felix d'Herelle Reference 




C-l 




Centre,Quebec,Quebec 




E920g 
Esc-7-U 








H19J 








Haiti 








HK243 








la 








K20 








K30 








KL 3 








M 








Mu" 








O103 








OI57:H7 








P1D 








ptl 








PilHa 








PR64FS 








PR772 








SS4 
(J4Q 








\vv* m 








Q8 








09-1 








92 






Haemophilus 


hpi** 




Nucleic Acids Res. 1996.24:2360-2368 


influenzae 


S2** 




Gene 1997. 196: 139-144 


Halobacterium


S45 




Felix d'Herelle Reference


cutirubrum 






Centre, Quebec, Quebec


Halobacterium






Felix d'Herelle Reference 


halobium 






Centre, Quebec, Quebec








Can.J.Microbiol. 1982.28:916-921


Halobacterium






Biol.Chem.Hoppe Seyler 1994.375:747-757


salinarium 














iCt*>h*ii>Ha nrvfflca 


tf -i i 


J.Gen.Microbiol.1987.133:953-960




Klebsiella pneumoniae 


60 
92 


23356- B1 

23357- B1 


The American Type Culture Collection 


10 




K19Q 




Felix d'Herelle Reference
Centre, Quebec, Quebec






FC3-1 &FC3-9 




Can J.Mictodiol l y y l . J / .t /u-z /a 






FC3-10 




r tmo MICTOD10Ll.e[l.l77l.u/.£7 i-zyi 




Klebsiella sp. 


K11** 




Mol.Gen.Genet. 1990.221:283-286


15 


Leptospira sp. 


LEI, LE3 & LE4 




Res.Microbiol. 1990.141:1131-1138




Listeria 


243 


23074-B1 


The American Type Culture Collection 


20 


monocytogenes 


197,1313 & 
9425 I 




Appl.Environ.Microbiol. 1997.63:3374-3377






H387&H387-A 




Appl.Environ.Microbiol. 1993.59:2914-2917






5775,6223 
&12682 




APMIS. 1993. 101: 160- 167 


25 




2389, 2671, 
d21 1 A 2685 




Intervirology 1994.37:31-35 &
Zentralbl.Bakteriol.Mikrobiol.Hyg. 1986.261:12-28






4b.4ab.4R & 3c 




Ann.Microbiol. (Paris) 1977.128:185-198






A118, A500& 
A5I1" 




MoLMicrobiol. 1995.16:1231-1241-992 


30 




1,3,4,5, 6, 7, 8, 
9,10,11,14,15, 
16.17, 19&20 




Ann.Microbiol. (Paris) 1979.130B:179-189






l/2a, l/2b, 3c, 
4ab. 6a & 6b 




Clin.Invest.Med.l984.7:229-232 






$LMUP35 
2685 




Felix d'Herelle Reference 
Centre,Quebec,Quebec 


35 


Listeria innocua 


4211 




Felix d'Herelle Reference 
Centre, Quebec, Quebec


40 


Micrococcus luteus 


N3 
N4 
N8 


4698-B1 
4698-B4 
4698-2 
4698-B3 


The American Type Culture Collection 


Micrococcus luteus 


N17 




CanJ.Microbiol. 1979.25:1027-1035 


45 


Mycobacterium 
smegmatis 


BK-3 
Bol" 
Bo 6 
Bo 611 
Bo 6111 
Mc-2 
Mc-4 
NN 

Phagus lacticola 
Rl 


27203-B1

27204-B1

27205-B1
27205-B2
27205-B3
607-B6
607-B7
11727-B1
11759-B1
607-B1


The American Type Culture Collection 







HER317 
HER 330 
HER 333 
HER 335 
HER 334 
HER 331 
HER 316 


Felix d'Herelle Reference
Centre, Quebec, Quebec


15 










20 




Legcndrc 
Leo 
Roy 
Sedge 






25 








Mol.Microbiol. 1993.7:395-405










J.Mol.Biol. 1998.279:143-164


30 


















Proc.Natl.Acad.Sci.USA 1988.84:2833-2837










Mol.Biol.Rep. 1981.30:11-15 


35 








Proc.Natl.Acad.Sci.USA 1997.94:10961- 
10966 


40 




29M,31M,122, 
154,37, 29D, 46, 
139,110, 141, 
74D, AGI& 
DS6A 




Arch.Virol.1993.133:39-49 &
Am.Rev.Respir.Dis. 1975.112:17-22




Mycobacterium 
fortuitum 


Bo 4 
Bo 7 


23052-B1 
27207-B1
27207-B2 


The American Type Culture Collection 






Mycobacterium leprae 






Ann.Microbiol. (Paris) 1982.133:93-97


Mycobacterium 
tuberculosis 


DS6A 


25618-B1 
25618-B2 
4243-B1 


The American Type Culture Collection 


110, 139&33D 




Arch.Virol.1993.133:39-49


AG1.GS4E, 
BGI, 

PH & BKl 




The Biology of Mycobacteria. Academic
Press, Toronto 1982 (Ratledge & Stanford)
1982.309-351


Mycobacterium sp 


Phagus pelleghni 

NN 

Bl 


11760-B1

11761-B1
23239-B1


The American Type Culture Collection








TM4, ph60, 

ph72, 

PhAE39, 

phAE40 

&Bxbl 




Microbiology 1995.141:1173-1181 




C2 




Experientia 1969.25:1112-1113




18 & 115 




J.Gen. Virol. 1987.68:949-956 




63 




Gnizlica 1968.36:617-622 




phlei & 
butyricum 




J.Gen.Virol.1975.29:235-238




MyF3P-59a 




Z.Allg.Mikrobiol. 1968.8:29-37




Bo2a 




J.Gen.Virol. 1973.20:75-87 








J.CXpil.IVICU. J 700. IXJ.Ji J-J*tV 




up 




j.Dacrenoi. i yo.>.oo.ouo-ouy 


Mycobacterium 


B5 


15483-B1 


The American Type Culture Collection 


Mycobacterium phlei 


NN 
Bo 2 
Bo2h 
Bo 3 


11728-B1
11758-B1
27086-B2 
27086-B1 


The American Type Culture Collection 


Mycoplasma 
arthritidis 


MAV1** 




Infect.Immunity. 1995.63:4016-4023


Mycoplasma hyorhinis 


Hr-1 




Arch.Virol.1983.77:81-85


Mycoplasma 
pneumoniae 


Br-1 




Arch.Virol.1983.75:1-15


Mycoplasma pulmonis 






Plasmid 1995. 33:41-49


Mycoplasma sp. 






J.Gen.Microbiol.1985.131:3117-3126








J. Virol.1986.59:584-590








Gene 1994. 141: 1-8 










Mirmhifw 1QQO 6d-l11.l2f 






Infection & Immunity 1995. 63: 4016-4023






M kLB io L 1982 .60:1 16-120 


MV-L2& 




Arch. Virol. 1979. 61:289-296






Acta. Virol. 1978.22:443-450 






J.Gen.Virol.1979.42:315-322






Virology 1973.55:118-126 












Science 1971.173:725-727 


Neisseria perflava






J.Clin.Microbiol.1976. 4:87-91


Nocardia erythropolis


<pC 




J.Gen. Virol. 1974.23:247-254 


<pEC 




J.Bacteriol.1976.126:1104-1107


Pasteurella multocida 


B22S 




Arch.Exp.Veterinarmed. 1981.35:433-436


B939a 




Am.J.Vet.Res.1978.39:1565-1566




Nos.115, 32,967 
& 

1075 




VetMed.Nauki. 1977.14:33-36 


Propionibacterium 
acnes 


NN 


29399-B1 


The American Type Culture Collection






Pseudomonas
aeruginosa 





line ui 


i 


12175-B2 


2A 


12175-B3 


2fi 


12175-B4


1 1 


lt£UJ*0 1 


16 


1 H&UI/"0 I 


z** 


14207-B1 


27 


14208-B1


44 


14209-B1 


73 


14210-B1 


95 


14211-B1


109 


14212-B1


113 


14213-B1 


249 


14214-B1


B3 


15692-B1 


HofT2 


14203-B1


HolT3 


14204-B1 


Pa 


12055-B1 


Pb 


12055-B2 


PB-1 


15692-B3 


Pc 


12055-B3 


Pf 


25102-B1 


PP7** 


15692-B2 



7&31 



The American Type Culture Collection 



Felix d'Herelle Reference
Centre, Quebec, Quebec



pn** 




J.Virol.1983.47:221-223


o>-mc 




Can.J.Microbiol.1969.15:1179-1186


pn** 




J.Mol.Biol.1991.218:349-364


PR4** 




J.Gen.Virol.1979.43:583-592


A7 




J.Bacteriol. 1992.174:2407-2411


KF1 




J.Biochem.1983.93:61-71






Mol.Microbiol. 1993.4:1703-1709



J.Virol.1977.24:135-141






<pKZ, 21,<pNZ. 
PMNI7, PTBSO, 
68,PB-l,E79. 

109, 352, 1214, 
F8,71,337, M4, 
<pC17, SL2, B17, 
Li-24, <pmnP78, 
PS17**,<j>l,73, 
M6.U-2, 7. 
<pmnF82, 
PTB2, PTB20, 
PTB42, <pKF77, 
31,PTB21. 
U9x, 

<pPLS27, B3. 
258, 

Hwl2.PM57, 
PM62.PM105. 
148,PM681, 
198, 

218, 222, 242, 
246, 

PC131.<pCll, 
SL5, 

D3112**,Jbl9, 
F7, 

PM69,PM13, 
PM61 t PM113, 
q>240. 249 & 269 



dd 






Pseudomonas 

aeruginosa 

(Cont'd) 



297. 309,318, 
II. 



Arch.ViroU993.131:14M5i 








Pseudomonas cepacia 






Felix d'Herelle Reference 
Centre,Quebec,Quebec 


10 










70 


Pseudomonas fragi


wy 


27362-B1
27363-B1


The American Type Culture Collection 




Pseudomonas 
phaseolicola 


#> 




Felix d'Herelle Reference 
Centre, Quebec, Quebec


20 


Pseudomonas putida 


Rb-1 


12633-B1


The American Type Culture Collection 


Pseudomonas syringae 


46 


40492-B1 
21781-B1


The American Type Culture Collection 




Pseudomonas sp. 


PPs-G3 


49780-B1 


The American Type Culture Collection 


25 


Salmonella bareilly 


Sab 2 




Felix d'Herelle Reference 

Centre, Quebec, Quebec


Salmonella enteritidis 


1,2^&6 




Epidemiol.Infect. 1995.114:227-236






2a, 3a, 4a, 5a, 6a, 
7a, 8a, 9a, 15. 
19, 20 &21*» 




Vet.Med.Nauki. 1975.12:55-60




Salmonella newington 


Epsilon34 




J.StructBiol. 1995.115:283-289 


30 


Salmonella newport 


16-19 


27869-B1 
27869-B2 


The American Type Culture Collection 


35 








Felix d'Herelle Reference 
Centre, Quebec, Quebec




Salmonella paratyphi 


Paratyphoid A 


19940-B1 
12176-B1 


The American Type Culture Collection 


40 




Jersey 




Felix d'Herelle Reference 
Centre, Quebec, Quebec




Salmonella 
senftenberg 


SasLl,SaL2, Sal 
3, 

SaL4, SaL5 & 
SasL6 




Indian J.MedRes. 1997.105:47-52 


45 


Salmonella 
typhimurium 


P22** 
SL-1 


19585-B1
40282 


The American Type Culture Collection 






MB78** 




J.Virol. 1982.41:1038-1043






SE1 




J.Gen.Microbiol.1986.132:1035-1041






LT2 




Virology 1971.45:835-836






ES18** 




Virology 1970.42:621-632


50 




L»* 




J.Virol.1985.56:1034-1036








PlCMclr-100 




Mol.Gen.Genet. 1975.138:113-126




F22 




Genet.Res. 1986.48:139-143




Feb 1 




J.Gen.Virol. 1978.38:263-272




FcU2 
>Bacteriophage 77, complete genome sequence, 41708 nucleotides

1 gatcaaaata ctcggggaac ggttagggag taaactccgc gataatttta aaaattcatg 

61 tataaccccc ctcttataac cattttaagg caggtgatga aatggagact atagtcgatg 

121 aaaatttagt gcctaaagaa aaagaaaggc tacaagtatt atataaagac atacctagca 

181 ataaactaaa agtagttgat ggcttaatta ttcaagcagc aaggctacgt gtaacgcttg 

241 attacatgtg ggaagacata aaagaaaaag gtgattatga tttatctact caatctgaaa 

301 aggcgccacc acatgaaagg gaaagaccag tagccaaact atttaatgct agagatgctg 

361 cacatcaaaa aataatcaaa caattatcgg acccattgcc cgaagagaaa gaagacacag 

421 aaacgccatc tgatgattac ctatgattag taataaatac gttgatgaat atataaattt 

481 gtggaaacaa ggaaagataa ctttaaaeaa agaaagaact gatctcctta attatctaca 

541 aaaacatata tattcacgag atgatgtata ttttgatgaa cagaaaatcg aggatcgtat 

601 caaatctatc gaaaaatggt attttccaac attaccattt caaaggctta tcacagctaa 

661 tatatttctt atagataaaa atacagatga agctntcttt acagaacttg ctattttcat 

721 gggaegtgga ggcgggaaaa acggtctaat aagtgctatc agtgacctcc tttctacgcc 

781 cttacacgga gttaaagaat atcacatctc cattgtcgct aatagcgaag atcaagcaaa 

841 aacatcgttt gatgaaatca gaaccgtttt aatggataac aaacgaaata agacgggtaa 

901 aacgccaaaa gctccttatg aagttagtaa agcaaaaaca ataaaccgtg caactaaatc 

961 ggttattcga tacaacacat caaacacaaa aaccaaagac ggtggacgtg aggggtgtgt 

1021 tatttttgat gaaattcatt atttctttgg tcctgaaatg gtaaacgtca aacgtggtgg 

1081 attaggtaaa aagaaaaata gaagaacgct ttatataagt actgatggtt ttgttagaga 

1141 gggttatatc gatgcaatga agcacaaaat tgcaagtgta ttaagtggca aggttaaaaa 

1201 tagtagattg tttgcttttt attgtaagtt agacgaccca aaagaagttg acgacagaca 

1261 gacgcgggaa aaggcgaacc caatgttaca taaaccgtta tcagaatacg ctaaaacact 

1321 gctaagcacg ategaagaag aatataacga ttcaccattc aaccgttcaa ataagcccga 

1381 attcacgact aagcgaatga atttgcctga agttgacctt gaaaaagtaa cagcaccatg 

1441 gaaagaaata ctagcgacta atagagagat accaaattta gataatcaaa tgtgtattgg 

1501 cggtccagac ttcgcaaaca ctcgagactt tgcaagtgta gggctattat tccgaaaaaa 

1561 cgatgattac atttggttag gacattcgtt tgtaagacaa gggtttttgg atgatgtcaa 

1621 attagaacct cceattaaag aatgggaaaa aatgggatta ttgaccattg tcgacgatga 

1681 tgtcattgaa attgaatata tagttgattg gtttttaaag gctagagaaa aatatgggct 

1741 tgaaaaagtc atagctgata attatagaac tgatattgta agacgtgcgt ttgaggatgc 

1801 tggcataaaa cttgaagtac ttagaaatcc aaaagcaata catggattac ttgcaccacg

1861 tatcgataca atgtttgcga aacataacgt aatatatgga gacaatcctt cgatgcgttg 

1921 gtttactaat aatgttgctg taaaaatcaa gccggatgga aataaagagc atatcaaaaa 

1981 agatgaagtc agacgtaaaa cggatggact catggcttct gttcacgcac tatatagagc

2041 agacgatata gtagacaaag acatgtctaa agcgcttgat gcactaatga gtatagattt 

2101 ctaacagagg aggcgagaca tgagtattct agaaaagaca cttaaaacta ggaaagatac 

2161 aacatatatg cttgatttag acatgataga agacctatca caacaagcgt acgtgaaacg

2221 tttagcgatt gatagttgta ttgaatttgt tgcgcgagct gtcgctcaaa gtcattttaa 

2281 agtattggaa ggtaatagaa ttcaaaagaa tgatgtttac tacaagttaa atataaaacc 

2341 aaatactgac ttatcaagcg atagtttttg gcaacaagtt atatataaac taatttatga 

2401 taacgaggtt ttaatcgtag taagtgacag caaagaatta cttatcgcag atagctttta 

2461 cagagaagag tacgctttgc atgatgatat attcaaagat gtaaeggtta aagattatac

2521 ttatcaacgt actttcacaa tgcaagaggt catatattta aagtacaaca acaataaagt 

2581 gacacacttt gcagaaagtc tattcgaaga ttacgggaaa atattcggaa gaatgatagg 

2641 tgcacaatta aaaaactacc aaataagagg gattttgaaa tctgcctcta gcgcatatga 

2701 cgaaaagaat atagaaaaat tacaagcgtt cacaaataaa ttattcaata cttttaataa 

2761 aaatcaacta gcaatcgcgc ctttgataga aggttttgat tatgaggaat tatctaatgg 

2821 tggtaagaat agtaacatgc ctttttctga attgagtgag ctaatgagag atgcaataaa 

2881 aaatgttgcg ttgatgattg gtatacctcc aggtttgact tacggagaaa cagctgattt 

2941 ggaaaaaaac acgcttgtat ttgagaagtt ctgtttaaca cctttatcaa aaaagattca 

3001 gaacgaatta aacgcgaaac tcataacaca aagcatgtat ttgaaagata caagaataga

3061 aattgtcggt gtgaataaaa aagacccact tcaatatgct gaagcaactg acaaacttgt 

3121 aagttctggt tcatttacaa ggaatgaggt gcggattatg ctaggtgaag aaccatcaga 

3181 caatcctgaa ttagacgaac acctgattac taaaaactac gaaaaagcta acagtggtga 

3241 aaatgatgaa aaagaaaaag atgaaaacac tttgaaaggt ggtgatgaag atgaaagcgg

3301 agattaaagg cgtcatcgtt tccaacgaag ataaatgggt ttacgaaacg cttggtatgg

3361 attcgacttg tcctaaagat gttttaacac aaccagaatt tagtgacgaa gatgttgata 

3421 ttataattaa ctcaaatggt ggtaacctag tagctggtag tgaaatatat acacatttaa 

3481 gagctcataa aggcaaagtg aatgttcgta tcacagcaat agcagcaagt gcggcatcgc 

3541 ttatcgcaac ggctggtgac cacatcgaaa tgagtccggt tgctagaatg atgattcaca 
3601 acccttcaog tattgcgcea ggagaagtga aagatctaaa tcatgctgca gaaacatcag

3661 aacatgctgg tcaaataacg gccgaggcat atgcggttag agccggtaaa aacaaacaag 

3721 aacttacaga aatgatggct aaggaaacgt ggctaaatgc tgacgaagcc accgaacaag 

3781 gccttgcgga tagtaaaatg tttgaaaacg acaatatgca aattgtagca agcgatacae 

3841 aagtgccacc gaaagatgta ctaaaccgtg taacagcttt ggcaagcaaa acgccagagg

3901 ttaacattga tattgacgca atagcaaata aagcaatcga aaaaataaat atgaaagaaa 

3961 aggaaccaga aatcgatgct gcagatagta aatcatcagc aaatggattt tcaagactcc 

4021 ttttttaata caaaaatagg aggtcataaa atgactacaa atttatcgga aaeattcgca 

4081 aatgcgaaaa acgaatttat taatgcagta aacaacggtg aaccgcaaga aagacaaaat

4141 gaattgtacg gtgacatgat caaccaacta cccgaagaaa ctaaattaca agcaaaagca 

4201 gaagctgaaa gagtttctag tttacctaaa ccagcacaaa ccttgagtgc aaaccaaaga 

4261 aatttcttta tggatatcaa caagagcgtc ggatataaag aagaaaaact tttaccagaa 

4321 gaaacaactg atagaatctc cgaagattca acaacgaacc atccactact agccgactta 

4381 ggtattaaaa atgctggctt gcgtttgaag tccccaaaat ccgaaacctc tggcgtggct

4441 gtctggggta aaatctatgg tgaaatcaaa ggtcaattag atgctgcgtt cagcgaagaa 

4501 acagcaactc aaaataaatc gacagcgtct gctgttctac caaaagatct aaacgattcc 

4561 ggtcctgcgt ggattgaaag atttgttcgt gttcaaatcg aagaagcatt tgcagtggcg 

4621 cttgaaactg cgttcttaaa aggtactggt aaagaccaac cgattggctc aaaccgccaa 

4681 gtacaaaaag gtgtatcggt aactgatggt gcttatccag agaaagaaga acaaggtacg 

4741 cttacatttg ctaatccgcg cgctacggtt aatgaattga cgcaagcgtt caaataccac 

4801 tcaactaacg agaaaggcaa atcagtagcg gttaaaggta atgtaacaat ggctgttaot 

4861 ccgtccgatg cttttgaggt tcaagcacag cacacacact caaatgcaaa tggcgtatat 

4921 gttactgctt taccatttaa cttgaotgtt attgagtcta cagttcaaga agcaggtaag 

4981 gttttaacgt acgttaaagg tctatatgat ggttatttag ctggtggtat taatgttcag 

5041 aaatttaaag aaacacttgc gttagatgat atggatttat acactgcaaa acaatttgct 

5101 tacggcaaag cgaaagataa taaagctgcc gctgtttgga aattagattt aaaaggacat 

5161 aaaccagcct cagaagatac cgaagaaaca ctataaaact ttatgaggtg ataaaatggt 

5221 gaaatttaaa gttgctagag aatttaaaga catagagcac aatcaacaca agtacaaagt 

5281 aggggagttg catccagctg aagggcacaa caatcctcgt gttgaattgt tgacaaatca 

5341 aatcaaaaat aagtacgaca aagcctacat cgtaccctta gataagctga caaaacaaga 

5401 atcattagaa ctacgcgaat cattacaaaa aaaagcgtct agttcaatgg ttaaaagtga 

5461 aatcatcgac ttattgaatg gtgaagacaa tgacgattga tgatttgctc gccaaattta 

5521 aatcacttga aaagattgac cataattcag aggatgagta cttaaagcag tcgctaaaaa 

5581 tgtcgtacga gcgtataaaa aatcagtgcg gagtttttga atcagagaat ttaataggtc 

5641 aagaattgat acttatacgc gctagatatg cttatcaaga tttattagaa cacttcaacg 

5701 acaattacag acctgaaaca atagattttt cgttatctct aatggaggta tcagaagatg 

5761 aagaaagtgt ttaagaaacc tagaattaca actaaacgtt taaatacgcg tgtccatttt 

5821 tataagtata ccgaaaataa cggtccagaa gctggagaaa aagaagaaaa attattatat 

5881 agctgttggg cgagtattga tggtgtctgg ttacgtgaat tagaacaagc tatetcaaac 

5941 ggaacgcaaa aegacatcaa attgcacatc cgtgatccgc aaggcgatta ttcacccagc 

6001 gaagaacacc atcttgaaat cgaaccaaga tatttcaaaa atcgtttgaa tataaagcaa 

6061 gtatcaccag atttggataa taaagacttt attatgattc gcggaggata tagttcatga 

6121 gtgtgaaagt gacaggtgat aaagcatcag aaagagaact agaaaaacac ttcggcataa 

6181 aagagatggt aaaagttcaa gacaaggcgc taatagctgg tgctaaggta attgctgaag 

6241 aaataaaaaa acaactcaaa ccttcagaag acccaggagc actgatcagt gagatcggcc 

6301 gtactgaacc tgaatggata aaggggaaac gtactgttac aattaggcgg cgtgggcctt

6361 ttgaacgatt tagaatagta catttaattg aaaatggtca tgttgagaaa aagtcaggaa 

6421 aatttgtaaa acctaaagct atgggtggga ttaatagagc aataagacaa gggcaaaaca 

6481 agtattttga gacgctaaaa agggagttga aaaaattgtg attgacattt tgtacaaagt 

6541 tcatgaagtg actagtcaag acagaattat tagagagcac gtaaatacca ataatattaa 

6601 gttcaacaaa taccctaatg taaaagatac tgatgtacct tttattgcta ttgacgatat 

6661 cgacgaccca atacccacaa cttatactga cggagacgag tgcgcataca gtcatactgc 

6721 ccaaatagat gtttttgtta agtacaatga tgaatataat gcgagaatca caagaaacaa 

6781 gatatccaat cgcacccaaa agttattatg gtctgaacta aaaatgggaa atgtttcaaa 

6841 tggaaaaccg gaatatatag aagaatttaa aacatacaga agctctcgcg tttacgaggg 

6901 catcctttac aaggaggaaa attaaatggc agcaaaacat gcaagtgcgc caaaggcgta 

6961 cattaacatt actggtttag gtttcgctaa attaacgaaa gaaggcgcgg aattaaaata 

7021 tagcgatact acaaaaacaa gaggatcaca aaaaatcggt gttgaaactg gtggagaact 

7081 aaaaacagcc tatgctgatg gcggcccaat tgaaccaggg aatacagacg gagaaggcaa 

7141 aacctcatta caaatgcatg cgttccctaa agagattcgc aaaattgctt ctaatgaaga 

7201 ccatgatgaa gatggcgttt acgaagagaa acaaggtaaa caaaacaatt acgtagctgt 

7261 atggttcaga caagagcgta aagacggtac atttagaaca gttttattac ctaaagttat

7321 gtttacaaat cctaaaatcg atggagaaac ggctgagaaa gattgggact tctcaagtga 

7381 agaggttgaa ggtgaggcac tcttcccttt agttgataat aaaaagtcag tacgtaagta

7441 caeetttgat tcagctaaca tgacaaatca tgatggagac ggtgaaaaag gcgaagaggc 

7501 tttcttaaag aaaattttag gcgaagaata tactggaaac gtgacagagg gtaacgaaga 

7561 aactttgtaa caaaaccggc ttcatcggaa actgcggtaa agtcggttaa tataccagat 

7621 agcattaaaa cacttaaagt tggcgacaca tacgatttaa atgttgtagt agagccatct 
7681 aatcaaagta agctatcgaa acacacaaca gaccaaacga acattgtatc aatcaatagt

7741 gatggtcaag ctaccgcgga agcacaaggc actgctacgg ttaaagcaac agttggcaat 
7801 atgagcgaca ccacaacaac aaatgtagaa gcataagagg gggcaacccc tctattttat 
7861 ttgaaaataa ggagagtact ataaaatggc aaaatcaaaa cgtaacatta ctcaaccagt 
7921 agaagaccca aaagcaaacg aaattaaatt acaaacgtac rtaacaccac acttcatttc 

7981 attegaaatt gtatacgaag caatggattt aatcgatgat attgaggacg aaaatagcac 

8041 gacgaagcca agagaaatcg ccgacagact gatggatatg gttgtaaaaa tttacgataa 

8101 ccaatccaca gttaaagacc caaaagaacg catgcatgca cctgacggaa tgaatgcact 
8161 tcgtgaacaa gtgattttca tcactcaagg tcaacaaact gaggaaacta gaaattttat 

8221 ccagaacatg aaacaaagcc tgaagattca acatataaag caacgttgaa aaataeggat 

8281 actctcacga cggacctaat tgaaaacggt aaagacgcea acgaagtttt aaaaatgcca 
8341 tttcattatg tgctttccat atatcaaaat aaaaataatg acatttctga agaaaaagca 

8401 gaggctteaa tegatgcatt ttaaccctaa ccgttegget agggttactt ccttgaactt
8461 tctcagaaag gaggtaaaaa atgggagaaa gaacaaaagg tctatccata ggcctggact 
8521 tagacgcagc aaacttaaac agatcatttg cagaaatcaa acgaaacttt aaaactttaa 

8581 attctgactt aaaattaaca ggcaacaact tcaaatatac cgaaaaatca actgatagtt 

8641 acaaacaaag gattaaagaa cttgacggaa ctaccacagg ttacaagaaa aacgttgatg 

8701 acctagccaa gcaatatgac aaggcatctc aagaacaggg cgaaaacagc gcagaagctc 

8761 aaaagtcacg acaagaatat aacaaacaag caaatgagcc gaattattta gaaagagaat 

8821 tacaaaaaac atcagccgaa tttgaagagt ccaaaaaagc tcaagttgaa gctcaaagaa 

8881 tggcagaaag tggctgggga aaaaccagta aagtttttga aagtatggga cctaaattaa

8941 caaaaatggg tgatggttta aaatccattg gtaaaggttt gatgattgge gcaactgcac 

9001 ctgctttagg tattgcagca gcatcaggaa aagcttttgc agaagttgat aaaggtttag 

9061 atactgttac tcaagcaaca ggcgcaacag gcagtgaate aaaaaaattg cagaactcat 

9121 ttaaagatgt ttatggcaat tttccagcag atgctgaaac tgtcggtgga gttttaggag 

9181 aagttaatac aaggttaggt ttcacaggta aagaactcga aaacgccaca gagtcactct 

9241 tgaaattcag tcatataaca ggtcctgacg gtgtgcaagc cgcacageta attacccgtg 

9301 caatgggcga tgcaggtatc gaagcaagtg aacatcaaag tgtcttggat atggtagcaa 

9361 aagcggcgca agctagtggg ataagtgtcg atacattagc tgatagtatc actaaatacg 

9421 gcgctccaat gagagctacg ggccttgaga tgaaagaatc aattgcttca tcctctcaat 

9481 gggaaaagtc aggcgttaat actgaaatag cattcagtgg tttgaaaaaa gctatatcaa 

9541 attggggtaa agctggtaaa aacccaagag aagaatttaa gaagacatca gcagaaattg 

9601 aaaagacgcc ggatatagct agcgcaacaa gtttagcgat tgaagcattt ggtgcaaagg 

9661 caggccctga tttagcagac gctattaaag gcggtcgctt tagttatcaa gaacttctaa 

9721 aaaccattga agactcccaa ggcacagtaa accaaacatt taaagattct gaaagcggct 

9781 ccgaaagatt taaagtagca atgaataaat taaaattagt aggtgctgat gtatgggctt 

9841 ctattgaaag tgcgtttgct cccgtaatgg aagaattaat caaaaagcta tctatagcgg 

9901 ttgactggtt tcccaattca agtgatggtt ctaaaagacc aattgttatt ttcagtggca 

9961 ttgccgccgc aattggtcct gtagttctcg ggttaggcgc atttataagt acaattggca 

10021 atgcagtaac tgtattagct ccatcgtcag ctagcatcgc aaaggccggt ggattgatta 

10081 gttttttatc gactaaagta cctatattag gaactgtctt cacagcttta actggtccaa 

10141 ctggcactgt attaggtgta ttggctggtc tagcagtcgc acttacaatc gcttataaga 

10201 aacccgaaac atttagaaat tttgttaatg gtgcaattga aagtgctaaa caaacaccta 

10261 gtaatcctat tcaatttatt caacctttcg ttgattccgt taaaaacatc tttaaacaag 

10321 cgatatcagc aatagtcgat tccgcaaaag atacttggag tcaaatcaac ggattcttta 

10381 atgaaaacgg aatttccacc gctcaagcac ctcaaaacac acgcaacttt actaaagcga 

10441 tatctgaact tatcttaaat tttgtaatta aaccaattac gttcgcgatt tggcaagtga 

10501 tgcaatttat ttggccggcg gttaaagcct tgattgccag tacttgggag aacacaaaag 

10561 gtgtaataca aggtgcctca aatatcatac tcggcttgat taagttctcc tcaagcetat 

10621 tcgttggtga ttggcgagga gtttgggacg ccgttgtgat gattcttaaa ggagcagttc 

10681 aattaatteg gaatttagct caattatggt ttgtaggtaa aacactcggc gctgttaggt 

10741 actttggcgg gttgccaaaa ggattgatag caggaatttg ggacgtaata agaagtatat 

10801 tcagtaaatc tttatcagca atttggaatg caacaaaaag tatttttgga tttctactta 

10861 atagcgtaaa atcaattttc acaaatatga aaaattggtt atctaacacc tggagcagta 

10921 tccgtacgaa tacaatagga aaagcgcagt cattatttag tggcgtcaaa tcaaaattta 

10981 ctaatttatg gaacgcgacg aaagaaattc ctagtaatct aagaaattgg atgtcaaata 

11041 tttggaattc cattaaagat aatacggtag gaattgcaag ccgtttatgg agtaaggtac 

11101 gcggaacttt cacaaatatg cgcgatggcc tgagttccat tacagataag attaaaagtc 

11161 atatcggcgg tatggtaagc gccattaaaa aaggacttaa taaattaatc gacggtttaa 

11221 actgggccgg tggcaagttg ggaacggata aaatacccaa gttacacact ggtacagagc 

11281 acacacatac tactacaaga tcagctaaga acggtaagac egcacgtgac acattcgcta

11341 cagttgggga caagggacgc ggaaatggtc caaatggttt tagaaatgaa atgattgaat

11401 tccctaacgg caaacgtgta atcacaccta atacagatac taccgcetae ctacctaaag 

11461 gctcaaaagt atacaacggt gcacaaacct attcaatgtt aaacggaacg cttccaagat 

11521 ttagtttagg tactatgtgg aaagatacta aatctggtge ateatcggca tttaactgga 

11581 caaaagataa aataggtaaa ggcaccaaat ggcttggcga taaagttggc gatgttttag

11641 attttatgga aaatccaggc aaactcttaa atcatatact tgaagctttt ggaactgatt 

11701 tcaatcctct aactaaaggt acgggaattg caggcgacat aacaaaagct gcatggccta 
agateaagaa
aettagtcgg 
cttataccgc 
aagaagttag 
atggtaatta 
accctagcaa 
ctggtaatac 
gacattctga 
gtggtggcgg 
cgcaaagtat 
ttgcaaaacg 
aaagaggaga 
ctaaacgtgg 
acattgtcag 
caggtggaaa 
ttattccaac 
cagaagtaag 
acgggtttga 
cttcattact 
ttattgacga 
aagaatcaac 
taaagcgaac 
ttctaattat 
gcgtaggctt 
tcacaacggc 
cgaggaacaa 
aggaccaata 
actaacagac 
agtttcagtt 
taaaccacct 
cgatgaggta 
tgatttcaaa 
taaggccggc 
tcccgacgca 
agattttcaa 
agcacaacat 
atatcatgat 
caaaaagata 
tcatatgcgg 
cattaaagac 
cggtaagttt 
ttataagtgg 
gaaaggcgca 
aaaaagtgtt 
tctcaatgtt 
gacggttaaa 
agatectaac
gacccacaaa 
aagagccgaa 
gcgegaattt 
tatagegtet 



cgaacaaacc 
tgaagtttta 
tagctctaat 
aggtaaagaa 
agaaatcaaa 
gctagttgtg 
ggggatatat 
agecaaaaca 
tgatttggaa 
acatagagat 
caeaatttea 
atcacgagaa 
tagcaatatc 
caaaatacac 
tacaagtaac 
aacaccaaat 



aagtgctact 
eggaatacta 
cgcaactgga 
aacgecgatg 
cgtaaaaatt 
atcaccacct 
eggatttagt 
ccctgaacca 
tgetacttet 
ttcaggtggt 
tgaaagtaac 
cccatcaaga 
atataccaac 
acgatatggt 
agtccttgat 
agatccagct 
agggaaaaaa 
tgatcctagc 
gaaaatagca 
ataegctett 
aaaagtaaag 
aacaaaacaa 
gttttaaaaa 
gaatcttata 
attaaaacac 
gttaaattac 
aagctgeaca 
ccttacaaat 
gtaaatagtg 
agttacttta 
accaaagaag 
ggttggacta 
ggtgactttg 
aaaggctggg 
accacccata 
atttatgaca 
agaaaaatag 
tacgactatc 
ctcagaagag 
ccagatagac 
catcagcgtc 
atggagatga 
agggatgtca 
gtcatcaatg 
gattctgggt 
tggcaagata 
gacaagatta 
cgtaatgtta 
aagttccgtg 
attattaact 
tatcttgecg 
acttcagaag 
gaacacgacg 
aagcaatcat 
acegtcaaag 
ategaatatg 
acagcattaa 
acagacgacg 
gaaccacaac 
gagttaaata 
gttaegtate 
tttaacccgc 
gaaaatagca 
gagttcaaca 
aacactatag 
aaaagegata 
cccgatgtcg 
gatgttgaaa 



gattggataa 
gaccctgaca 
agaccatttc 
ggeggcagae 
actageggeg 
agtggcacga 
acaggaccac 
tatttaagga 
ggaagcggcg 
cgtcataaag 
taccagccaa 
ggattattcc 
tttaataatc 
tggggtggtt 
ggttggtata 
cgtagaaatg 
gcgagcaaaa 
ttactattga 
caatctaacg 
gataaaaagg 
tctagaaaag 
ttcctcggtt 
cagaaaatgt 
gttctgatat 
atgatgacgt 
aattcaaatc 
aagaatttac 
attcagtaac 
ggactgctga 
tgattactaa 
ttaaggaeta 
agatgattac 
tgatatccaa 
ttggcgctgg 
aatgtatcgt 
gtgatggcaa 
gacatattgt 
agaacaaacc 
taggtaataa 
gtaaacctat 
cagctcctat 
atgggttagg 
ttacacaaaa 
aggaaccaat 
acagtgaatt 
gatattcata 
tagactrcct 
atgacaattc 
aacgacatcg 
gggttcaaga 
atataacaac 
cactgaaaga 
gettaegtae 
gtacaaccta 
gtagatatgt 
gcaaagatct 
tcgctgtggg 
aagcgcaaag 
cagatgatca 
aacgcaagcc 
cgcacgagat 
cattgtatgt 
catatacatt 
agegatcgaa 
ttaaagatgt 
caccgccaga 
ctgtcttgcg 
aattaggegg 



aagaaaatct 
aaattaatta 
atgaaggtgt 
ttacaagaac 
ttaecgatat 
tggtaaagee 
atttacattt 
atgctaagaa 
caacttatgc 
gtaaatggat 
atgcagtgaa 
aaatcacegg 
cagtacatca 
ttaaacgtgc 
acttaggtga 
aegcaatgaa 
ataagcgccc 
aaaegatcga 
atgtgatcgc 
tgaacgcgtc 
gaggaattgc 
gtatgccgaa 
agatggacgt 
acctttggtg 
cttgaatgaa 
taaagattgg 
aatacctgtt 
aggaaataaa 
cactccttta 
aaacgatgaa 
catgcctcct 
cgaagatatt 
tcttggcgaa 
cacgaaacga 
tgaacaaaaa 
gttacttget 
tgtcacgttg 
gataatgtat 
attttctatt 
tgatatggat 
catagctgtc 
ttcattcaat 
aggtgattta 
gttgagegag 
aatcatacaa 
gaaaggagat 
ttctactgat 
agaaacgett 
tgttattata 
tacgatggac 
agccaaaccg 
tgtgctgagc 
tacgtcatgg 
taaaatggtt 
agtactcaaa 
ageegggtto 
acctgaaaac 
tcaattcaac 
aaacatgaat 
ggcagttatg 
tatatcaacc 
agaggcagaa 
cggecaacct 
cataatacat 
tgtagatggt 
aaacccagtc 
tagatattgg 
tataacaaga 



agaagctatg 
tcattatgga 
cgattttcca 
gecatttatg 
getatttgeg 
cggtgacgtt 
tgaaacgagg 
aaaeggaaga 
cagecgagta 
tcatgaccaa 
taactgggat 
ctcaactttt 
aggtatctca 
eggtgactae 
agaeggtcat 
gattttgeat 
cagccaatca 
acaacagcaa 
agataaagat 
tatagaaaag 
tatccaatga 
»9«gggtttg 

teggggtcta 
gtacgtaatg 
ttagtaaagt 
tactggaacg 
aagttcacta 
aatactgega 
attgttgaag 
gaetaettca 
gtttaccata 
ccaagtaatg 
ggatataaag 
gggctcccta 
ggtaaaggtg 
tctattggtt 
tataaccaaa 
aacttggaca 
aaaacttgga 
gagaaagagt 
tatagtgega 
aeggagatte 
gtaaaaatag 
aaatcgttcg 
cctgaaaacg 
gagagtgtga 
gacccttcct 
gaactgetea 
agggattcaa 
ggctacacag 
tatgeaccag 
gatacaggtt 
acttcttacc 
ccagattttt 
aagaaaaaca 
actaggaaga 
gacaaaggga 
ctacctatgc 
gaaacacgat 
ccatatgaga 
ggcgatacag 
gttattgctg 
aaagagttca 
caaaagctaa 
gaattagaat 
aatgataege 
aatggtcgat 
gagaaagege 



ggcggtggcg 
cgtaccgcag 
tttgtatatc 
cctggtggtt 
catttgaaaa 
gttggcctaa 
agaaatggac 
ttatcaacag 
atccgacaag 
aegatgegeg 
acaaacgctc 
agagcaaacg 
geaaegcagt 
geatatgeta 
ccagaatgga 
tatgeagcag 
tcagacttaa 
caacaaatag 
tatcagcega 
cgagaaaggc 
tagacaccat 
aaataccctc 
tatataaagg 
actatttatc 
ctcctaacca 
cttatttcga 
tcaaagtagt 
tttcagacca 
cccgagcaat 
tggttggtga 
gtgagtetcg 
acttaggtgg 
caaccaattt 
aagegatgae 
ceggaagaac 
acgaaaataa 
aaggagaccc 
gaatcgttgt 
aatttgacca 
ggatagatgg 
agtacaaegg 
taccgaaacc 
atatgeaage 
gaagtaatta 
tctttgatac 
tacatgtttt 
tagttagagc 
tatcatcaga 
acaaacaatg 
agacagaatg 
gcaaatttga 
gggaagtttc 
aaaccagata 
acattgagee 
gcttactcaa 
tcgatatgtc 
agegctcaga 
gctatatttg 
taagttcttt 
ttactcccac 
tcagagcaaa 
aagaatataa
aagaaccaga 
acgataatat 
accttgaacg 
tttggtatga 
ggattgaagc 
tattcagtga 
15841 attaaacaat acttttacta acttacctat acaacacgcc agtcttttgt cagaagctac

15901 agaaccaccg aacagcgagt actcagtaga taatgatttg aaagcggact cacaagcaag 

15961 ttcagacgct gtgattgatg tttataatca aattaaaaat aatttagaat ctacgacacc 

16021 cgaaaccgca acgatcggtc ggttggtaga tacacaagct ttatttcteg agtatagaaa 

16081 gaaattacaa gatgtttata cagatgcaga agacgccaaa atcgccattt cagatagact 

16141 taaattatta cagtcacaat acactgatga aaaatataaa gaagcgttgg aaataatagc 

16201 aacaaaactt ggtttaacgg tgaatgaaga tttgcagtca gccggagaac ccaacgtcgc 

16261 taaatcagcc attgaagcag ctagagaatc cacaaaagaa caatcacgtg actatgtaaa 

16321 aacatcggac tataaaacag acaaagacgg tattgttgaa cgttcagata ctgctgaagc 

16381 cgagagaacg actttaaaag gtgaaateaa agacaaagtt acgtcaaacg aataccgaaa

16441 cggattggaa gaacaaaaac aatatactga tgaccagtta agtgatttgt ccaataatcc 

16501 tgagattaaa gcaagtattg aacaagcaaa tcaagaagcg caagaagctt caaaatcata 

16561 catcgatgct caagatgatc ttaaagagaa ggaatcgcaa gcgtatgctg acggtaaaac 

16621 ttcggaagaa gagcaacgcg ccacacaaga tgctcaagct aaacttgaag aggcaaaaca 

16681 aaacgcagaa ctaaaggcta gaaacgctga aaagaaagct aatgctcata cagacaacaa 

16741 ggtcaaagaa agcacagatg cacagaggaa aacattgact cgctatggtt ctcaaaetat 

16801 acaaaacggc aaggaaatca aateaagaac tactaaagaa gagtttaatg caaccaaccg 

16861 tacactttca aatatattaa acgagattgt tcaaaatgtt acagatggaa caacaatcag 

16921 acacgacgat aacggagtgg ctcaagcttt gaatgcgggg ccacgtggta ttagattaaa 

16981 tgccgacaaa attgatatta acggtaacag agaaaeaaac cttcetatcc aaaatatgcg 

17041 agataaagta gacaaaaccg atattgtcaa cagtcttaat ttatcaagag agggtcttga 

17101 taccaacgtt aacagaattg gaactaaagg cggcgacaat aacagatatg ttcaaataca 

17161 gaatgattct ategaactag gtggtattgt gcaacgtact tggagaggga aacgttcaac 

17221 agacgatatt tttacgcgac tgaaagacgg tcacctaaga tttagaaata acaccgctgg 

17281 cggttcacrt tatatgtcac atttcggtat ttcgacttat actgatggtg aaggtgaaga 

17341 cggtggttca tctggcacga ttcaatggtg ggataaaact tacagtgata geggcaegaa

17401 tggcataaca atcaattcce atggtggcgt cgccgcacta aegtcagata ataategggt 

17461 tgetceggag tettacgett catcgaatat caaaagcaaa caggcaccgg cgtatttata 

17521 cccaaacaca gacaaagtgc ctggaccaaa ccgatttgea ttcacgctgt etaatgeaga 

17581 taatgettat tcgagtgacg gecatactat gtttggttct gatgagaact atgattaegg 

17641 tgegggtate aggttttcta aagaaagaaa taaaggtctt gctcaaattg ctaatggacg 

17701 acatgeaaca ggtggagata caacaaccga agcagggtat ggcaaactta atatgetgaa 

17761 aegaegtgae ggtaataggt atattcatat acagagtaca gacctactgt ctgtaggttc 

17821 agatgatgea ggagatagga tagcttctaa ctcaatttat agaegtaett attcggccgc 

17881 agctaatttg catattactt ctgctggcac aattgggcgt tcgacatcag cgegtaaata

17941 caagttatct atcgaaaacc aatataacga tagagatgaa caactggaac attcaaaagc 

18001 tattcttaac ttacctatta gaacgtggtt tgacaaagct gagtctgaaa ttttagccag 

18061 agagctgaga gaagatagaa aattatcgga agacacctat aaacccgaca gataegcagg 

18121 ectgattget gaagaggegg agaatttagg atcaaaagag ttcgtcacgt atgatgacaa 

18181 aggagaaact gaaggcacag egtatgateg tctatggatt catcetatcc ctgctatcaa 

18241 agaacaacaa ctaagaacca agaaattgga ggagccaaag aacgeaggat aacaaacaag 

18301 gatcacaagc taaccctgaa tatacaattc attatttatc acaggaaatt atgaggttaa

18361 cacaagaaaa cgcgatgtta aaagegtata tacaagaaaa taaagaaaat caacaatgtg 

18421 ctgaggaaga gtaatcctta gcactatttt tatacaaaaa tttaaggagg tcatttaatt 

18481 atggcaaaag aaattatcaa caatacagaa aggtttattt tagtacaaat cgacaaagaa 

18541 ggtacagaac gtgeagtata tcaagatttc acaggaagtt ttacaacttc tgaaatggtt 

18601 aaccacgctc aagattttaa atctgaagaa aacgecaaga aaatcgcgga gacgttaaat 

18661 ttgttatatc aattaactaa caaaaaacaa cgtgtgaaag tagttaaaga agtagttgaa 

18721 agatcagatt tacctccaga ggtaacagtt aacactgaaa cagtatgaaa agctatgagt 

18781 tagatactca tagtctttat tcttttagaa agcgggtgta ctgaattggg gcggttcaaa 

18841 aaacacgaac acgaatggcg catcagaagg ctagaagaga atgataaaac aatgetcage 

18901 acactcaacg aaattaaatt aggtcaaaaa acccaagagc aagttaacat taaattagat 

18961 aaaaccttag atgetattea aaaagaaaga gaaacagatg aaaagaataa gaaagaaaat

19021 gataagaaca toegtgatat gaaaatgtgg gtgcttggtt tagttgggac aatatttggg 

19081 ccgccaatta cagcattatt gegtacgett atgggcatat aagagaggtg attaccacgt 

19141 teggattaaa ttttggagct tcgctgtgga cgtgtttctg gtttggtaag tgtaagtaat 

19201 agtcaagagt cagtgetteg gcactggctt tttattttgg ataaaaggag caaacaaatg 

19261 gatgeaaaag taataacaag atacategto ttgatcttag cattagtaaa tcaattctta 

19321 gcgaacaaag gtattagece aattccagta gacgatgaaa ctatatcatc aataatacct 

19381 actgeagteg ctttatatac aacgtataaa gacaatccaa catcccaaga aggtaaatgg 

19441 gcaaatcaaa aattaaagaa atataaagee gaaaataagt acagaaaagc aacegggcaa 

19501 gegecaatta aagaagtaat gacacctacg aatatgaacg acacaaatga tttagggtag 

19561 gtggttgata tatgetaatg acaaaaaacc aagcagaaaa atggtttgac aatccattag 

19621 ggaaacaatt caacccagat ggccggtatg gatttcagtg ttatgattac gecaatatgt 

19681 tctttatgtt agegacagge gaaaggctgc aaggtttaca cgcttataat atcccgtttg 

19741 acaataaagc aaagactgaa aaatatggtc aaataattaa aaactatgac agctttttac 

19801 cgcaaaagtt ggatattgtc gttttcccgt caaagtatgg tggeggaget ggacacgttg 

19861 aaattgttga gagegcaaat ttaaatactt tcacaccatt tggtcaaaac tggaacggta 
aaggttggac
tccactatta 
ttggcaataa 
aacctaaaaa 
acggaacaaa 
taagacatgc 
atcaagatac 
ttaaatcaca 
caagtggtgg 
tacaagatgt 
tactaaatgt 
tcatcaccaa 
taatagccgg 
caccagccaa 
atgtccctta 
taagagacgg 
ttacgtatga 
gtggacaacg 
gttttggtaa 
tatagggaat 
tttttaacat 
tattttttta 
caccaactat 
gatagagagc 
eaacagttea 
tagccgggca 
cacccgatac 
tcgatacggt 
attgtggtta 
caaacgctcg 
tttggacgct 
cttgtgttaa 
aaaaaagggc 
taactctctg 
tgtatgcccc 
tatgtgtgta 
agctgaggac 
gaatataaac 
cattatttet 
cgcggtccca 
gccattaata 
ctcacctacg 
agctctatac 
catctctaaa 
cttctttggt 
ggcgtatcta 
tacgtttgat 
taaattttga 
cgttacttta 
aaaagccaaa 
ttctaaacga 
tacacgtttc 
ttcattgttc 
aaaaaacaat 
aaaatacaga 
aacatatacg 
agggacctgc 
tattactgga 
ctggaaagaa 
tgatactatg 
atcagatata 
ttttttatcg 
aaacaatcct 
ttctttagag 
catcactact 
aaaaactatc 
aaacctcttg 
agtttttaat 



taatggcgtt 

tgacaatcca 

agctaaaggt 

aattatgctt 

cgaacgcgac 

aggacatgaa 

tgcatacggt 

ggggtaegac 

gcatgttatt 

cactaaaaat 

caatgtacca 

caaaaatgat 

tgcgattcat 

aaacaaaaaa 

taaaaaagaa 

ttatccaact 

cggcgcatac 

tcgtcatata 

gtccagcacg 

cttacagtta 

ttctctcaag 

tgttatagct 

ttaeatctat 

atagttttca 

cggggtgctc 

gaggccatgt 

atatatctta 

tatatttatt 

ctttttgcgc 

tggaaaagct 

cgtgtacgtt 

aaagccttta 

agaaaaaggg 

tccatcttct 

actctttcca 

tgccttagtg 

aatcgtttgt 

cctctatcaa 

ttcaatacat 

gtagtatett 

gcgaccgttt 

cgcatacctg 

tgcatgttat 

tagttataca 

agtgtgacgc 

acagcttctt 

aatttgttaa 

gaaccgctct 

aagccagacg 

gtttttaatt 

aacattgcct 

catttatctg 

ttatttttaa 

aagggtaggc 

cgccacttat 

tgttttaaag 

aatatattat 

tttttaattt 

tttatgcaag 

ttattaatgt 

aattcaacaa 

aaaacttctt 

aaataacact 

gacaagggaa 

gcaaagtgtg 

tctccttgtt 

agtaaatagt 

ttattaatge 



gcgcaacctg 
atgtatttta 
attattaagc 
gtagccggtc 
tetataegta 
gttgeattat 
gttaatgtag 
attgttctag 
atctcaagtc 
aacttaggac 
gcagaaataa 
atggattgga 
ggtaagecta 
aatccaccag 
caaggcaatt 
aattcaagaa 
tgtattaatg 
gcgacaggag 
atttagtatt 
ttaaataact 
atttaaatgt 
agectteggg 
ccttgttcac 
tactactccc 
ttatgttata 
atctgactgt 
acaacataga 
cccctacaac 
ttttttgggg 
aaaaggttaa 
agagaatgac 
atatcagctg 
cagatacctt 
ctgttacatg 
taattgcttt 
tgtgagtagt 
ttatcctact 
catagcttgg 
ttgetatect 
tgtgaccaaa 
tatttttgag 
ttaaagcttg 
tatcgttcag 
ttttegctte 
tatttaatat 
tcatatgtcc 
taaatgtttg 
ttttgatgtt 
tttttatatg 
egcttgaega 
etttttgega 
tataeggate 
atttttcaaa 
gggctaccca 
aattataaga 
gataaacctt 
tattaattet 
tttggggtaa 
cgtaactatt 
ttctgtcaat 
aataatcttt 
ttaatatagc 
cccatttcaa 
taacatttac 
aattagaaaa 
taaactttgg 
gaatatctga 
gtttttctat 
gttggggtcc

ttaggttaaa 

aagegactae 

atggttataa 

aatatataac 

acggtggctc 

gcaataaaaa 

aaatacattt 

aattcaatgc 

aaataagagg 

atataaatta 

ttaagaaaaa 

taggtggttt 

tgecagcagg 

acacagtagc 

ttacaggggt 

gttatagatg 

aggtagacaa 

tacttagaat 

atttggatgg 

agataacagg 

ctagtttttt 

ccaagcatgt 

cgtagtatat 

attgetttta 

rggtcccaca 

aatgttacat 

caacaaaacc 

caaaaaaagg 

aaatgacaaa 

eggtttacea 

ttacaaagga 

ttagtacaca 

tgtatacacc 

taacgatata 

aactttttta 

gecttgeata 

ttcccattgt 

tgaattgatg 

tccagcatta 

gtcaacatct 

aacttctaca 

tataaaatcg 

ttctttetct 

gtgttcgttt 

aagttgacgc 

catgtacttt 

tttgattctt 

atattcaagc 

cttgttgttt 

tegctttgta 

tttgtatttc 

ccacatttta 

tgaaaattgt 

ttacatggtt 

taatatatta 

atttatcagt 

aacttttctt 

accttttaat 

tttatttaat 

agtgatgaat 

tgaattattt 

atcaaaattc 

tatatcctcc 

ttctttatta 

acaaaaacct 

atctaacttt 

attatgegtc 



tgaaactgtg acaagacatg 
cttccctaac aacttaagcg 
aaaaaaagag gcagtaatta 
cgatcctgga gcagtaggaa 
gectaatate gecaagtatt 
aagtcaatca caagatatgt 
agattatggc ttatattggg 
agaegcagea ggagaaagcg 
agatactatt gataaaagta 
tgtgacacct cgtaatgatt 
tegtttatet gaattaggtt 
ctatgacttg tattctaaat 
ggtagctggt aatgttaaaa 
ttatacactc gataagaata 
taatgttaaa ggtaataatg 
attacccaac aacacaacaa 
gattacttat attgetaata 
ggcaggtaat agaataagta 
aaaaattttg ctacattaat 
atgttaatat tcctatacac 
caggtacttc ggtacttgee 
gttatgatgt gttacacatg 
cactggatgt tttttcttgc 
atgactttag cattcccgta 
tatagtagga gtgaactata 
ggagacatct tccttgtcat 
ccgctataac cgtatcttaa 
acagatccta ttaatttagg 
gcagattatt tgaaaaaggg 
aaccttgata caacagtgtt 
tcatacaagg gtgggattaa 
tttgtagcgt ctttaaaaat 
agtttttcta atttttgetc 
tttatagtcg ttttttcatc 
ttcatttccg ccaataaact 
tttatattta atgattctgc 
ggatttcctt ggcaagttgt 
tgcatctttt tattttctaa 
gegattttte ttcttgaacc 
catttgattc tgtgaatagt 
ttaacttgga gagctaataa 
gccccagcaa ctaaaatacg 
cgtatctgta ttacctgttc 
atatcttcta tegtcttact 
ggataattgt aaaatttaac 
tttacctgat ttgeagaata 
gtatcaattt tgtttaaaag 
gttttcaaat tatcaagegt 
cattcatcta ataacgegtg 
agtttttctt ttattttttc 
ttcttattca agacaacact 
tegtagtate tatacttcgt 
catccctcct caaaattggc 
ataaaaaaag aegectgtat 
aattaccaaa aatggtaacg 
aaattatatc atcttatatc 
aacataatat ccgaagaatc 
atgegaaact tactaategg 
ttttttacct tatcaattgc 
ttattttcaa tttctaaact 
tctgtgttgt ttttttggta 
tgcgcgctaa ttaaatttaa
atctttaaat actttttgtt 
gtattagaat catttttatt 
aegtttatae cgaaatctac 
ttatggtctt tttcaccttc 
ttaaattttg gatttccaga 
atcatttctc ctttattctc 
24001 gctcacactc ccaccaccat tcaacgtcta cactcgcagg cgttttctga tcagtaaaat
24061 cacaacgaat cttctctggt taacttaccg ccatctactt ctcgtgaaat aaattccaag 
24121 cattcacgcg caccatgtga cgataaaccc ctaggtaacc cataagtgaa tggttgatta 
24181 ecactagtta aaacttcata taccatagct tcctcctcca ttttgcaatt agetattttc
24241 attataaact ccttttaaac actgctgaaa tagaegtctt tttcaaataa gcatgattaa 
24301 tactttaott ctttaatcca catatattta aaagtgaggt agtaggtaac aaacataaga 
24361 cctaaagtta agattgctct tttcacgcca attectcctc tgtttatatt tatattaaag 
24421 cgctaaatat acgttattaa tcacaataca actttgccca ttactttaat atcactaaac 
24481 gaagcgactt tgatatcatc atactccgga tttagagata ccaaattaat atagtcttcg
24541 catatatcta cacgcttgat aagacttact ccatctaata caacgagtgc aactgtacca
24601 tcttcaacag aatcctcttt cttaacaaaa gcgcacgttc cctgtcttaa cataggttcc 
24661 attgaatcac cattaactaa aatacaaaaa tcagcacctg acggcgtttc gtcttcttta 
24721 aaaaatactt cttcatgcaa tatgccacca tataatectt cccctatgcc agcaccagtt 
24781 gcaccacatg caatatacga tactagttca gactctteat attcatctat agaagtgact
24841 ttattctgtt catctaactg cccatttgca tagttaagta cgttttcttg gcggggaggc 
24901 gtgagttgag aaaatatgtt attgattttt gacattatcg tttcatcttg acgttctccg 
24961 tcaggaactc gataagaaec cacaccacac cccataagcc acgcctcacc gacatccaaa 
25021 gttttagaca acaagaataa tttacgttgg tctggagaag accttccatt aacatactgg
25081 gataagtgac tttttgacat cttaatattc aatcctcttt gaaagggttt cgacttttct
25141 agaacatcta cttgacgcaa gttcctatcc ctcacaactt gctttaatct ttcagaagcg 
25201 ttttgcattg gtaatgcctc cttgaaattc attatatagg aagggaaata aaaaccaata
25261 caaaagccca acctttttaa ctttttgtgt tgacatcgtt caaaatcggg gttatagtca 
25321 ttatagttca aatgcttgaa cctaggaggc gattatttga atactaatac aacttttgat 
25381 ttttcgttat tgaacggtaa gacagccgaa gcgtacccga cacaacttaa ctttgctaca
25441 gccttaggeg catcagaaag aacttcgccc ccgaagctga acaacaaagt accatggaaa 
25501 acaacagaca ttatcaaagc ttgtaagtca ttgggaatac ctacaaaaga cgttcacaaa
25561 tatcttttta aacagaaagt tcaaatgttt gaacttaaca agtaaaggag gcataacaca 
25621 tgcaagaacg agaaaaggtc aacaaaagta acacatcttc aaatgaagca ccaaaacctt 
25681 ttaggacaaa ttgaagctta cgacaaaacg cttaaagaaa taaagtacac tcgagacctt 
25741 tacaacaaac acctaagcat gaacaacgaa gacgcactcg ctggtttgga aatggtagag 
25801 gatgaaatta ctaaaaagcc acgaagtgct atcaaagagt tccaaaaagt agtgaaagcg 
25861 ttagacaagc ttaacggcgt tgaaagcgat aacaaagtta ctgacttaac agagtggcgg 
25921 aaagtgaatc agtaacattc acttcttaat ataaccacgc ctaccaacat ccacattgag 
25981 cagatgcgag cgagagccgg cgatgatacg agccgcgttt aaatacattc gatagccatt 
26041 gcgataaccg cctgctgaat gtgggtgttg aggaaaaagg aggatactca aatgcaagca 
26101 ccacaaacac ctaatttcaa agagctacca gtaagaacag tagaaattga aaacgaacct 
26161 tattttgtag gaaaagatat tgctgagatt ttaggatacg caagatcaaa caatgccatt 
26221 agaaatcatg ttgatagcga ggacaagctg acgcaccaat ttagtgcatc aggtcaaaac 
26281 agaaatatga ccattatcaa cgaatcagga ttatacagtc taatcttcga tgcttctaaa 
26341 caaagcaaaa acgaaaaaat tagagaaacc gccagaaaat tcaaacgctg ggtaacatca 
26401 gatgtcctac cagccattcg caaacacggt atatacgcaa cagacaatgt aattgaacaa 
26461 acattaaaag atccagacta catcatcaca gtgttgactg agtataagaa agaaaaagag 
26521 caaaacttac tttcacaaca gcaagtagaa gtcaacaaac caaaagtact atccgctgac 
26581 ccggtagccg gcagcgataa ctcaatacct gtcggagaac cagcgaaaat acttaaacaa
26641 aacggcgttg acataggaca aaacagattg ctcaaatggc caagaaataa tggatatctc 
26701 attaaaaaga gtggagaaag ttataactca ccaactcaaa agagtatgga tctaaaaatc 
26761 ttggatatca aaaaacgaat aattaataat ccagatggtt caagtaaagc accacgtaca 
26821 ccaaaagtaa caggcaaagg acaacaatac tttgttaata agtccttagg agaaaaacaa 
26881 acatcttaaa aggaggaaca caacggaaca aatcacatta accaaagaag agttgaaaga 
26941 aattatagca aaagaagtta gagaggccat aaacggcaag aaaccaatca gttcaggttc 
27001 aactttcagc aaagtaagaa ccaacaacga cgattcagaa gaaatcaata aaaaactcaa 
27061 tttcgcaaaa gatttgtcgc caggaagatc gaggaagctc aaccatccga ttccgctaaa
27121 aaagtatcag catggcttcg aatcaattca tcaaaaagct tatgtacaag acgttcatga 
27181 ccatattaga aaattaacat tatcaatttt tggagcgaca cttaattcag acttgagtga 
27241 aagtgaacac aacctagcag caaaagctca ccgagaaacc aaaaactact atctatacat 
27301 ctatgaaaag agagtttcag aattaactat cgatgacttc gaataaagga ggaacaacaa 
27361 atgttacaaa aatttagaat tgcgaaagaa aaaaataaat taaaactcaa attactcaag 
27421 catgctagtc actgtctaga aagaaacaac aaccccgaae tgttgcgagc agttgcagag 
27481 ctgtcgaaaa aggctagcca aatccaacgg caaggacctg ccctgcctcc acacttagag 
27541 cttgagaccc aacaaacaca taagctctag tagggtccag aaaaaatgcc tcgatttcct 
27601 ctctcgtaac agtttcaacc ccttcatacc ccggaaaaac aattttcttt aaatccgaaa 
27661 catgcttttt tgaaccatcc tttaaagtaa ctagaagctt catacttatc acctccttag
27721 gttgataaca acattacaca cgaaaggagc ataaacaaca tgcaagcatc acaaacaaat 
27781 tcgaacatcg gagaaacgct caatactcaa gaaaaagaaa acggagaaat cgcaatcagc 
27841 ggtcgagaac tccatcaagc atcagaagtc aagacagcac ataaagattg gtttccaaga 
27901 acgcttaaac acggatctga agaaaataca gattacacag ctatcgctca aaaaagagca 
27961 acagctcaag gcaacatgac tcaccacatt gaccacgcac tcacactaga cactgcaaaa 
28021 gaaaccgcaa tgatccaacg tagtgaacct ggcaaacgtg caagacaata tttcatccaa 
gttgaaaaag catggaacag cccogaaatg ottotgcaac gtgccctaaa aactgctaac
aacacaacca atcaaccoga aacaaagact gcacgcgaca aaccaaaaat tgtatttgca 
gatgcagcag ctactactaa gacatcaatt ttagttggag agtcagcaaa gatcattaaa 
caaaacggca caaacatcgg gcaacgcaga ttgtttgagc ggccacgtca aaacggattc 
cttattaaac gcaagggtgt ggattataac atgcctacac agtattcaat ggaacgcgag 
ttattcgaaa ttaaagaaac accaatcaca cactcggacg gtcacacatc aattagcaag 
acgccaaaag taacaggtaa aggacaacaa tactctgtca acaagttttt aggagaaaaa 
caaacaactt aacaggagga attacaaacg aacgcactat acaaaacaac cctccccatc 
acaatggcag ttgtgacgtg gaaggtttgg aagattgaga agcacactag aaaacctgtg 
actagtagca gggcgttgag tgactatcta aacaacaaat cettaaccat accgaaagac 
gctgaaaacc ccaccgaacc tgctcgtegc cttttgaagt ecgccgaaca aactattagc 
aaotaacaac attatacacg aaaggaaaga tagaaacgcc aaaoatcaca gtaccaccaa 
caccagaaaa cacacacaga ggcgaagaaa aacctgtgaa aaagttatac gcaacaccta 
cacaaatcca tcaattgttt ggagtatgta gaagtacagt atacaactgg ttgaaatatt 
accgcaaaga taatttaggt gtagaaaatt catacattga ttattcacca acaggcactc 
tgattaatat ttctaaotcg gaagagtatt tgatcagaaa gcataaaaaa tggtattagg 
aggacattaa atgagcaaca tctacaaaag ccacctagca gcagtatcat gceccacagt 
cttagcgatc gtacttacgc cgtttccata cctcactaca gcatggtcaa ttgcgggatt 
cgcaagtatc gcaacattca tgtactacaa agaatgcttt ttcaaagaat aaaaaaaccg 
ctacttgttg gagcaagtaa cagtatcaaa cacttaagaa aaaattcatg ttcaacataa 
aacgaaaaac ggaggaagtc aagacgcatt acgaaatagg cgaaatcata cgcaaaaata 
ttcatgttaa cggattogat tttaagctat tcactttaaa aggtcatatg ggcataccaa 
tacaagttaa agatacgaac aacgcaccaa ttaaacatgc ttatgtcgta gacgagaatg 
acttagatat ggcatcagac ttatctaacc aagcaataga tgaatggatt gaagagaaca 
cagacgaaca ggacagacca attaacttag tcatgaaatg gtaggaggcc gctatgaagc 
agactgtaac ttatatcatt cgtcataggg acacgccaac ttatataact aacaaaccaa 
ctgacaacaa ttcagatatt agttacccca caaatagaaa tagagctagg gagtttaacg 
gtatggaaga agcgagtatc aatatggatt atcacaaagc aaccaagaaa acagtgacag 
aaactactga gtacgaggag gtagaacacg actgaggaaa aacaagaacc acaagaaaaa 
gtaagcatac tcaaaaaact aaagacaaat aacatcgctg agaaaaataa aaggaaattc 
tataaattcg cagtatacgg aaaaattggc tcaggaaaaa ccacgtttgc tacaagagat 
aaagacgctt tcgtcattga cattaacgaa ggtggaacaa cggttaccga cgaaggatca 
gacgtagaaa tcgagaacta tcaacacttt gtttatgttg taaatttttt acctcaaact 
ttacaggaga tgagagaaaa cggacaagaa accaatgttg tagctattga aaetattcaa 
aaacttagag atatgacatt gaatgacgtg acgaaaaata agtctaaaaa accaacgtet 
aatgatcggg gagaagttgc tgaacgaatt gccagtatgt acagatcaat aggaaanctt 
caagaagaat acaaattcca ctttgctact acaggtcacg aaggtaccaa caaagataaa 
gatgatgaag gtagcactat caaccctact atcactattg aagcgcaaga acaaactaaa 
aaagctatta cttctcaaag tgatgtgtta gccagggcaa tgattgaaga atttgatgat 
aacggagaaa agaaagctag atatattcta aacgctgaac cttctaatac gcttgaaaca 
aagattagac attcaccttc aataacaatt aacaataaga aatttgcaaa tcctagcatt 
acggacgtag cagaagcaat tagaaatgga aactaaaaat taattaaaag gacggtattt 
aattatgaaa atcacaggac aagcgcaatt cactaaagaa acaaaccaag aaaagtttta 
taacggctca gcagggtttc aagctggaga atccacagtg aaagctaaaa atactgaatt 
caatgataga gaaaatagat acttcacaat cgtacttgaa aatgacgaag gcaaacaata 
taaacacaac caatttgtac cgccgtataa atatgatttc caagaaaaac aaccgaccga 
attagttact cgactaggta ttaagteaaa tcttcctagc ttagatctcg ataccaatga 
tcttactggt aagccttgtc acttggtatt gaaatggaaa ttcaatgaag atgaaggtaa 
gtattttacg gatttttcat ttateaaacc ttacaaaaag ggcgatgatg ttgttaacaa 
acctattccg aagacagata agcaaaaagc tgaagaaaat aacggggcac aacaacaaac 
atcaatgtct caacaaagca atccacttga aagcagtggc caatttggat atgacgacca 
agatttagcg ttttaaggtg tggttcaaac gcaatacatt acaagacacc agaaagataa 
cgacggcact tactccgtcg tcgctactgg tgttgaactt gaacaaagtc acatcgactc 
actagaaaac ggaeatccac taaaagcaga agtagaggtt ccggacaaca aaaaactatc 
catagaacaa cgcaaaaaaa tattcgcaat gcgtagagat atagaacttc acrggggcga 
accagtagaa tcaactagaa aatcatcaca aacagaattg gaaaccatga aaggttatga 
agaaaccagt ctgcgcgact gttctatgaa agttgcaagg gagttaatag aaccgactac 
agcgctcatg etccatcatc aaatacctat gagtgtagaa acgagtaagt tgttaagcga 
agataaagcg ttateatatt gggctacaat caaccgcaac tgegtaacat gcggaaagcc 
tcacgcagac ctggcacatt atgaagcagt cggcagaggc atgaacagaa acaaaatgaa 
ccactatgac aaacacgcat tagcgttatg tcgcgaacat cacaacgagc aacatgcgat 
tggcgttaag tcgttcgacg acaaatacca ctcgcacgac tcgcggacaa aagttgatga 
gaggctcaat aaaatgttga aaggagagaa aaaggaatga atagactaag aataacaaaa 
atagcacccc taatcgtcat cttggcggaa gagactagaa atgccatgca tgctgtaaaa 
gtggagaaaa ttttaaaatc tccgtttagt taatacaggt ttctacaaaa gctttaccac 
aggcggacaa actaattgag ccttttttga tgcctactac ccaggggctg caatgcaacc 
ttaatacttc aaatccaatg ccagaaagtt tacttatcgc tcctaggttg tgtcctgact 
ttaacattct tttaacaaat tctaatcccg aaacaaatcc tcgtttttct ataatcttat 
taoagcgact taaaaactga ggagcacaaa acttattata aattcccttt cttgttaagt
aagacacgcc aaaagtttca tttaaaaccc ctaacctcac taggtcatta attgaaactc 
cggttgattc tatatctaac ggagagtctt ttattaacgt gtccgataca tccataccgt 
cactctctgg gtttaaaacc gccctatatt taacggcagg atgtactteg tgattcttta 
aatgttttaa aagaatagca ccatttgggg ataattgttc aattatctca acaaacgaat 
ggtgggttaa tgagtttttc ccgtcatcca cagatgatgc tattagtttt gcgaacatat 
tacttaaagt tttttcacta atgtaaaact ttgaagcctc tagagcagga cctagaagag 
aaaattgtgg ttcccgtaaa ttattettag gcacagaaga tatttctttt tcaaattgtt 
ctttgaattt ctcaaactct acttctcttt gataaataac tttatccaca caaaggcgga 
atttcccaaa gacaagttcc caagttttag agaatgtctc tacaggcccc tctgatgcgc 
ccccaataac tttatcaata cetttaccta aaataggatc cataattatt cacccccaac 
ctaacgcaat agcgataaca aaattatacc agaaaggaga atcaacatga ctgaccaacc 
aagttactac tcaataatta cagcaaatgt cagacacgat aaccgactta ctgacagcga 
aaagttactt tttgcagaaa taacatcttt aagtaacaaa tacggatacc gcacagcaag 
caatggtcac tttgcaactt tatacaacgt tgttaaggaa actatatctc gtagaatttc 
gaaccttacc aactttggtt atctaaaaat cgaaattatc aaagaaggta atgaagtcaa 
acaaaggaag atgtacccct tgacgcaaac gccaatacct attgacgcaa aaatcaatac 
ccctattgat aattctgtca atacccctat tgacgcaaat gtcaaagaga atattacaag 
tattaataat acaagcaaca acaatataaa tagaatagat atattgtcgg gcaacccgac 
agcatcttcc ataccctata aagaaateat cgattactta aacaaaaaag cgggcaagca 
ttttaaacac aatacagcta aaacaaaaga ttttattaaa geaagatgga accaagattt 
taggtcggag gattttaaaa aggtgattga tatcaaaaca gctgagtggc taaacacgga 
tagcgataaa taccttagac cagaaacact ttttggcagt aaatttgagg ggcacctcaa 
ccaaaaaata caaccaaetg gcacggacca attggaacgc atgaagtacg acgaaagtta 
ttgggatcag ggggatatta tgaaaccact attcagcgaa aagacaaacg aaagcttgaa 
aaaatatcaa cctactcatg tcgaaaaagg attgaaatgt gagagatgtg gaagtgaata 
cgacttatat aagtttgctc ctactaaaaa acaccogaat ggctacgagt ataaagacgg 
ccgcaaacgt gaaatctatg aggaacataa gcgaaacaag caacggaaga caaacaacat 
atccaatcaa tcaaacgtta atccgccttt aagagatgca acagtcaaaa actacaagcc 
acaaaatgaa aaacaagtac acgccaaaca aacagcaaca gagtacgtac aaggcctctc 
tacaaaagaa ccaaaatcac taatactgca aggttcacac ggaactggta aaagccacct 
agcatacgct atcgcaaaag cagtcaaagc taaagggcat acggttgctt ttatgcacat 
accaatgttg atggaccgca tcaaagcgac atacaacaaa aatgcagtag agactacaga 
cgagccagtc agattgctaa gtgatattga tttacttgta ccagatgata tgggtgtaga 
aaacacagag cacactctaa ataaactttt cagcattgtt gataacagag taggtaaaaa 
caacatcttt acaactaacc ttagtgataa agaactaaat caaaatatga actggcaacg 
tataaattcg agaacgaaaa aaagagcaag aaaagtaaga gtaatcggag acgatttcag 
ggagcgagat gcacggtaac caaagaattt ccaaaaacta aacttgagtg ctcagatatg 
tacgctcaga aactcataga tgaggcacag ggcgatgaaa ataggttgta cgacctattt 
acccaaaaac ttgcagaacg tcatacacgc cccgctaccg tcgaatacta aggagtgtta 
aaaatgccga aagaaaaata ttacttatac cgagaagatg gcacagaaga tattaaggtc 
atcaagtata aagacaacgt aaacgaggtt tattcgctca caggagccca tttcagcgac 
gaaaagaaaa ctatgaccga tagtgaccta aaacgattca aaggcgctca cgggcttcta 
tatgagcaag aactaggttc acaagcaacg atatttgata tttagaggtg gacgatgagt 
aaatacaacg ctaagaaagt tgagcacaaa ggaattgtat ttgacagcaa agtagagtgt 
gaatattacc aatatttaga aagtaatatg aatggcacta attatgatca tatcgaaata 
caaccgaaac tcgaattatt accaaaacca gataaacaac gaaagactga atatattgca 
gacttcgcgt tatatctcga tggcaaactg attgaagtta tcgacattaa aggtatgcca 
accgaagtag caaaacttaa agctaagact ttcagacata aatacagaaa cataaaactc 
aattggatat gtaaagcgcc taagtataca ggtaaaacat ggattacgta cgaggaatta 
attaaagcaa gacgagaacg caaaagagaa atgaagtgat ctaacgcaac aacaagcata 
tataaacgca acgattgata taaggacacc tacagaagtt gaatatcagc attttgatga 
tgtggataaa gaaaaagaag cgccggcaga tcacttatat aacaatcccg acgaaatact 
agagtatgac aacctaaaaa ttagaaacgt aaacgtagag gtggaataaa cgggcagtgt 
tgcaaccatc aacaataaac catataaatt taacaatttt gaaaaaagaa ataacggcaa 
agcgtgggat aaatgccgga atcgtttcta aacgtgtcag aggttgttgg gagttttcag 
aagctttaga cgcgccttat ggcatgcacc taaaagaata tagagaaatg aaacaaatgg 
aaaagattaa acaagcgaga ctcgaacgtg aaccggaaag agagcgaaag aaagaggctg 
agctacgtaa gaagaagcca catttgttca atgtacctca aaaacattca cgtgatccgt 
actggttcga tgtcacttat aaccaaatgt tcaagaaacg gagtgaagca taatgagcat 
aatcagtaac agaaaagtag atatgaacaa aacgcaagac aacgttaagc aacctgcgca 
ttacacatac ggcgacattg aaactacaga ctttatcgaa caagttacgg cacagtaccc
accacaatta gcattcgcaa caggtaatgc aattaaatac ttgtctagag caccgttaaa 
gaatggtcac gaggatttag caaaggcgaa gtttcacgcc gatagagtat ttgacttgtg 
ggagcgatga ccatgacaga tagcggacgt aaagaatact taaaacacct tttcggctct 
aagagatatc tgtatcagga taacgaacga gtggcacata tccacgtagt aaatggcact 
tattactttc acggtcatat cgtgccaggt tggcaaggtg tgaaaaagac atttgataca
gcggaagagc ttgaaacata cataaagcaa agtgatttgg aatatgagga acagaagcaa 
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ctaeccttat ttcaaaaggg cggaaacaat gaaaatcaoa actgaaaaag aaatgaatct 
acctgaactt atccaatggg ctcgggataa ccccaagtta tcaggtaata aaagattcta 
ttcaaatgat gttgagcgca actgttttgt gacttttcat gttgatagca tcttatgtaa 
cgcgactgga cacgtatcaa teaacgataa atttactgtt caagaggaga tataacaatg 
aaaaccaaag ttaaaaaaga aaegagatca gatgaattaa ctaaatgggc gcgagaaaac 
ccggatctat cacaaggaaa aatatttttt tcaacaggac ctagcgatgg attcgttcgt 
cttcatccaa acacaaataa gcgttcgacg ccaagtttta ctccaatcga tatccccttc 
atagttgata tcgaaaaaga agcaacggaa gagactaagg ctgataggtt gattgaatta 
ctcgagattc aagaaggaga ctataactct acactatatg agaacactag tataaaagaa 
egtttatatg gcagacgtgt gcccaccaaa gcattctaca ccttaaacga tgacccaact 
atgacgtcaa tctggaaaga tggggagttg ctagtatgat getgaaattt aaagctcggg 
ataaagacaa aaaagccatg agtaccatcg acgaaaccga tttcaatagt gggtacactt 
tgatctcaac aggttataaa agttccaatg aagtaaaact attacaacac acaggattta 
aagatgtgca cggtgtggag atttatgaag gggatattgc tcaagattgc tattcgagag 
aagtaagttt tatcgagttt aaagaaggag cctcttacat aacttttagc aatgcaactg 
aattactaag tgaaaacgac gacattattg aaattgttgg aaatattttt gaaaatgaga 
tgctattgga ggttatgaga tgacgttcac cttatcagat gaacaatata aaaatictttg 
tactaactct aacaagttat tagataaact tcacaaagca ttaaaagatc gtgaagagca 
caagaagcaa cgagatgagc ttattgggga tatagcgaag ttacgagatt gtaacaaaga 
cctagagaag aaagcaagcg catgggatag geattgeaag agcgtcgaaa aagatttaat 
aaacgaattc ggtaacgatg atgaaagagt caaactcgga acggaattaa acaataaaat 
ctctatggag gatgacacaa atgaacaatc gcgaaaaaat cgaacagtcc gttattagtg 
ctagtgcgta caacggtaat gacacagagg ggttgctaaa agagattgag gacgtgtata 
agaaagcgca agcgtttgat gaaatacttg agggaatgac aaatgctatt caacattcag 
ctaaagaagg cactgaacct gatgaagcag tagggattat ggcaggtcaa gttgtctata 
aacacgagga ggaataggaa aatgaccaac acattacaag taaaactatt atcaaaaaat 
gccagaatgc ccgaacgaaa tcataagacg gatgcaggtt atgacacact cccagctgaa 
actgtcgtac tcgaaccaca agaaaaagca gtgatcaaaa cagatgtagc tgtgagtata 
ccagagggct atgtcggact attaactagt cgtagtggtg taagtagtaa aacgtattta 
gtgategaaa caggcaagat agacgcggga taccacggca atttagggat taatatcaag 
aacgatgaag aacgtgatgg aatacccttt ttatatgatg atatagacgc tgaattagaa 
gatggattaa caagcattct agatataaaa ggtaactatg tacaagatgg aagaggcata 
agaagagttt accaaatcaa caaaggcgat aaactagctc aattggtcat cgtgcctaca 
fcggacaccgg aaccaaagca agtggaggaa ttcgaaagtg tttcagaacg tggagcaaaa 
ggcttcggaa gtagcggagt gtaaagacat cttagatcga gttaaggagg ttttggggae 
gcgacgcaac acttagtcac aacattcaaa gattcaacag gacgaccaca tgaacatacc 
actgtggcta gaga taa tea gaegtttaca gttattgagg cagagagtaa agaagaagcg 
aaagagaagt acgaggcaca agtcaaaaga gatgeagcta ttaaagtggg tcagttgtat 
gaaaatataa gggagtgtgg gaaacgaegg atgttaaaat taaaactatt teaggeggag 
tttattttgt aaaaacagct gaaccttttg aaaaatatgt tgaaagaatg acgagcttta 
atggttatat ttaegcaage actataatca agaaaccaac gtatattaaa acagatacga 
ttgaatcaac cacacttatt gaggagcatg ggaaatgaac cagctgagaa ttttattaca 
egaeggtage agtctgatat tacatgaaga tgaattattt aacgaaatag tatttgtttt 
ggacaacttt agaaatgatg atgactattt aacgatagaa aaagattatg gcagagaacc 
tgtaetgaac aaaggctata tagttgggat caatgttgag gaggcagatg acgatcaaca 
tacctaaaat gaaatccccg aaaaagtaca ctgaaataat caaaaaatat aaaaacaaag 
cacctgaaga aaaggctaag attgaagatg attttattaa agaaaccaaa gataaagaca 
gtgaatctta cagtcctacg atggctaata tgaatgaata tgaattaagg gctatgttaa 
gaatgatgee tagtttaatt gatactggag atgacaatga tgatcaaaaa acttaaaaat 
atggatgggt ccgacatctt tattgttgga atacegtcat catteggtat attegcattg 
ctacttgtta tcacattgcc tatctataca gtggctagtt accaacacaa agaactacat 
caaggaacta ttacagataa acataacaag agacaagata aagaagacaa gttctatatt 
gtattagaca acaaacaagt cattgaaaat cccgacttat cattcaaaaa gaaatttgat 
agegcagata tacaagctag gttaaaagta ggegataagg cagaagctaa aacaateggt 
tatagaatac actttttaaa tttatatccg gtcttatacg aagcaaagaa ggtagataaa 
caatgattaa acaaatacta agaccattat tcttactagc aatgtatgag ttaggcaagt 
atgtaactga gcaagtgtat attatgacga eggctaatga tgaegtagag gcgccgagcg 
ateaegtett tegageggag gtgagtgaat aotgagaata tttatttatg atttgatcgt 
tttgctgttt gctttcttaa tatccatata tattattgat gatggagtga taataaatgc 
attaggaatt tttggtatgt ataaaattat agattccttt tcagaaaaca ttataaagag 
gtagataaaa atgaacgagc aaataatagg aagcatatat actttagcag gaggegcegt 
gctttattca gtcaaagaga cccccaggta ttttacagac cctaacttac aacgtaaaaa
aatcaactta gaacaaatat atccgatata tttagattgt tttaaaaagg ctaaaaagat 
gattggagct tatattattc caacagaaca gcacgaattt ttagattttt tcgatattga 
agtcttcaat aattcagata agcaaagtaa aaaagegtat gaaaatgtta ttggatttag 
acaaatgatt aatttatcaa atagagttaa ggcaatggaa gattttaaga tgagcttcaa 
caatgaatct agtacaaatc agattttctc taatccttct tttgttatgg aaacaattgc 
cactataaac gaatatcaaa aagatatatc ttatttaaaa aatataatta ataaaatgaa 
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40321 tgaaaataga gcctacaatc atattgacag ttttatcact ccagagtacc gacgaaaaat 

40381 aaacgatcac aacccctatc etgataaatt cgaagaacag tttagtcaaa agtttaaaat 

40441 aaacagaacc tcgataaaag aaagaattat tattaattta aacaagagga gactcaaatg 

40501 acgtggotta ctatgactat tgtatttgct atattgctae tagtttgcat cagtattaat

40561 agtgatcgcg caagagagat acaagcacct agatatacga acgattatct acccgacgaa 

40621 gtagttaaaa ctaaagggta caacgggcta gaagaataca ggattgaacc gaagcgaacg 

40681 aataacgaca ttaaaaagta atttatatta ccggaggtat tgcatcgaat gataaagatt

40741 gagaaacacg atatcaaaaa gcttgaagaa tacattcagc acatcgacaa ctatcgaaga 

40801 gagttgaaga tgcgagaata tgaattactt gaaagccatg aaccagaeaa tgcgggagct 

40861 ggcaaaagta atctgccggg taacccgact gaacgacgtg caacaaagaa gtttagtgat 

40921 aacaggtaca atacattaag aaatatagtt aacggcgtag atagattgat aggtgaaagt 

40981 gacgaggata cgcttgagtt actaaggttt agatattggg attgtcctat tggttgttat 

41041 gaatgggaag atatagcaca ctactttggt acaagcaaga caagtatatt acgtogaagg 

41101 aatgcaccga tcgataagct agcaaagtat attggctatg tgtagcggac tcttacccca 

41161 tgtaagtccg catcaaaaca gtttattatg tcagtatcag attaatattc aaagttatta 

41221 aatgctaaca cgacgcatga acaagaggcg catcactacg tgatgtgtcc ttttacttat 

41281 gaggtatgaa catgttcaaa ctaaetgtaa atacactact acacatcaag tatagatgag 

41341 tcttgatact acttaagtta tataaggtga aacattatga tgactaaaga cgaacgcata 

41401 cgactctata agtctaaaga atggcaaata acaagaaaaa gagtgctaga aagagataac 

41461 tacgaatgtc aacaatgtaa gagagacggc aagttaacga cacacgacaa aagcaagcgt 

41521 aagtcgttgg atgtagatca tatatcaccg ctagaacatc atccggagtt tgctcacgac 

41581 tcaaacaatt tagaaacact gtgtattaaa tgtcacaaca aaaaagaaaa gagatttata 

41641 aaaaaagaaa ataaatggaa agacgaaaaa cggtaaacac ccccgggtca aaaaaatcaa 

41701 aagcgatc 
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48 
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49 
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50 


                                      77ORF054   29574..29795
 8   77ORF012                    56   77ORF066   27494..27703
10   77ORF014   27760..28512     57   77ORF069   38341..38547
11   77ORF015   3291..4028       58   77ORF070   36269..36475
12   77ORF016   32867..33610     59   77ORF071   40498..40701
13   77ORF017   23269..23982     60   77ORF072   38735..38938
14   77ORF018   31169..31840     61   77ORF073   30945..31148
15   77ORF019   39851..40501     62   77ORF074   38544..38738
16   77ORF020   6926..7570       63   77ORF075   13673..13870
17   77ORF021   37762..38304     64   77ORF077   25357..25605
18   77ORF022   30605..31156     65   77ORF079   29089..29280
19   77ORF023   26903..27346     66   77ORF080   35204..35389
20   77ORF024   10700..11140     67   77ORF085   24060..24242
21   77ORF025   9707..10147      68   77ORF092   39706..39876
22   77ORF026   40729..41145     69   77ORF094   32226..32393
23   77ORF027   6518..6925       70   77ORF096   13606..13773
24   77ORF028   34795..35199     71   77ORF098   7092..7256
25   77ORF029   6117..6521       72   77ORF102   29051..29212
26   77ORF030   36478..36879     73   77ORF104   34393..34551
27   77ORF031   39151..39546     74   77ORF109   18282..18434
28   77ORF032   33892..34266     75   77ORF112   39543..39692
29   77ORF033   5758..6120       76   77ORF117   27361..27501
30   77ORF034   7886..8236       77   77ORF118   38390..38530
31   77ORF035   19258..19560     78   77ORF120   36059..36199
32   77ORF036   36876..37223     79   77ORF124   33699..33833
33   77ORF037   102..446         80   77ORF128   14221..14355
34   77ORF038   34908..35219     81   77ORF130   15675..15806
35   77ORF039   37220..37528     82   77ORF133   8414..8542
36   77ORF040   41377..41676     83   77ORF140   13113..13235
37   77ORF041   35454..35753     84   77ORF147   7029..7148
38   77ORF042   5490..5774       85   77ORF149   30668..30787
39   77ORF043   29304..29564     86   77ORF151   31837..31953
40   77ORF044   18481..18768     87   77ORF155   30278..30391
41   77ORF045   5216..5500       88   77ORF157   4044..4157
42   77ORF046   25663..25935     89   77ORF167   20692..20799
43   77ORF047   11159..11425     90   77ORF175   35717..35821
44   77ORF048   28776..29039     91   77ORF176   6836..6940
45   77ORF049   36013..36255     92   77ORF178   35390..35491
46   77ORF050   35753..36007     93   77ORF179   8318..8419
47   77ORF051   38931..39167     94   77ORF182   29268..29564
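The coordinate pairs above fix the length of each ORF, and that length can be cross-checked against the residue counts reported in the physico-chemical tables. The sketch below is illustrative only: the function name `orf_lengths` is hypothetical, and it assumes inclusive coordinates written as `start..end` with the final codon being a stop codon.

```python
def orf_lengths(coords: str) -> tuple[int, int]:
    """Return (nucleotide length, amino-acid length) for 'start..end' coordinates."""
    start, end = (int(part) for part in coords.split(".."))
    nt = end - start + 1              # coordinates are inclusive at both ends
    if nt % 3:
        raise ValueError(f"{coords}: length {nt} is not a multiple of 3")
    return nt, nt // 3 - 1            # assumption: the last codon is a stop codon


if __name__ == "__main__":
    for name, coords in [
        ("77ORF017", "23269..23982"),
        ("77ORF019", "39851..40501"),
        ("77ORF043", "29304..29564"),
    ]:
        nt, aa = orf_lengths(coords)
        print(f"{name}: {nt} nt, {aa} aa")
```

Under these assumptions the three ORFs come out at 237, 216 and 86 residues, which agrees with the "Number of amino acids" entries given for them.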
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Table 4 

77ORF017 sequence

23982 atgacgcataatatagaaaaacgcattaataaattaaaaacttct
1 MTHNIEKRINKLKTS

23937 ggaaatccaaaatttaaaaagttagattcagatattcactattta
16 GNPKFKKLDSDIHYL

23892 ctcaagagatttgaaggtgaaaaaaaccataaaggtttttatcca
31 LKRFEGEKNHKGFYP

23847 aagtttaaacaaggagaaatagtttttgtagatttcggtataaac
46 KFKQGEIVFVDFGIN

23802 gttaataaagaattttctaattcacactttgcaatagtgatgaat
61 VNKEFSNSHFAIVMN

23757 aaaaatgattctaatacggaggatatagtaaatgttattccctta
76 KNDSNTEDIVNVIPL

23712 tcctctaaagaaaacaaaaagcatttaaagatgaattttgatttg
91 SSKENKKYLKMNFDL

23667 aaatgggagtattatttaagattgtttttaaatttaattagcgcg
106 KWEYYLRLFLNLISA

23622 caaaataattcagctatattaaaagaagttttcgataaaaaatac
121 QNNSAILKEVFDKKY

23577 caaaaaaacaacacagaattcatcactaaagattattttattgaa
136 QKNNTEFITKDYFIE

23532 tttatatctgatagtttagaaattgaaaataaattaaataaaatt
151 FISDSLEIENKLNKI

23487 gacagaaacattaataacatagtatcagcaattgataaggtaaaa
166 DRNINNIVSAIDKVK

23442 aaattaaaaggtaatagttacgcttgcataaattctttccagccg
181 KLKGNSYACINSFQP

23397 atcagtaagtttcgcataagaaaagttttaccccaaaaaattaaa
196 ISKFRIRKVLPQKIK

23352 aatccagtaatagattcttcggatattatgttactgataaataga
211 NPVIDSSDIMLLINR

23307 attaataataatatattgcagatccctgatataagatga 23269
226 INNNILQIPDIR*
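Each amino-acid row in the listing is the codon-by-codon translation of the 45-nucleotide row above it. As a minimal sketch, assuming the standard genetic code (the listing does not name a translation table) and using a hypothetical helper `translate`, a row can be re-derived like this:

```python
from itertools import product

BASES = "tcag"
# Standard genetic code, codons enumerated in t, c, a, g order at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a lowercase DNA string in frame 1; unknown codons become 'X'."""
    dna = dna.lower()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    # First and last rows of the 77ORF017 listing
    print(translate("atgacgcataatatagaaaaacgcattaataaattaaaaacttct"))
    print(translate("attaataataatatattgcagatccctgatataagatga"))
```

Mapping unrecognised codons to `X` is a design choice made here so that scanning errors in a nucleotide row show up in the output instead of raising an exception.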
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Physico-chemical parameters of ORF 77ORF017 

1 MTHNIEKRIN KLKTSGNPKF KKLDSDIHYL LKRFEGEKNH KGFYPKFKQG EIVFVDFGIN

61 VNKEFSNSHF AIVMNKNDSN TEDIVNVIPL SSKENKKYLK MNFDLKWEYY LRLFLNLISA

121 QNNSAILKEV FDKKYQKNNT EFITKDYFIE FISDSLEIEN KLNKIDRNIN NIVSAIDKVK

181 KLKGNSYACI NSFQPISKFR IRKVLPQKIK NPVIDSSDIM LLINRINNNI LQIPDIR

Number of amino acids:                            237
Average molecular weight (Daltons):               27887.38
Mean amino acid weight (Daltons):                 117.67
Monoisotopic molecular weight (Daltons):          27869.83
Mean amino acid monoisotopic weight (Daltons):    117.59

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot    Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       5       2.11%    7.58%                     Cys   C       1       0.42%    1.66%
Asp   D       14      5.91%    5.28%                     Glu   E       13      5.49%    6.37%
Phe   F       16      6.75%    4.09%                     Gly   G       6       2.53%    6.84%
His   H       4       1.69%    2.24%                     Ile   I       29      12.24%   5.81%
Lys   K       33      13.92%   5.95%                     Leu   L       19      8.02%    9.42%
Met   M       4       1.69%    2.37%                     Asn   N       30      12.66%   4.45%
Pro   P       7       2.95%    4.9%                      Gln   Q       6       2.53%    3.97%
Arg   R       8       3.38%    5.16%                     Ser   S       17      7.17%    7.12%
Thr   T       5       2.11%    5.67%                     Val   V       11      4.64%    6.58%
Trp   W       1       0.42%    1.23%                     Tyr   Y       8       3.38%    3.18%

Number of acidic (negative) amino acids (ED):     27    11.39%
Number of basic (positive) amino acids (KR):      41    17.30%
Total charge (KRED):                              68    28.69%
Net charge (KR - ED):                             14    5.91%
Theoretical pI:                                   10.01
Total linear charge density:                      0.30
Average hydrophobicity:                           -5.37
Ratio of hydrophilicity to hydrophobicity:        1.41
Percentage of hydrophilic amino acid:             57.81%
Percentage of hydrophobic amino acid:             42.19%
Ratio of %hydrophilic to %hydrophobic:            1.37
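The count-based entries in the table above (residue numbers, percentages, and the ED/KR charge rows) follow directly from the protein sequence. The sketch below recomputes them for 77ORF017; it is an illustration only, the helper names `charge_summary` and `percent` are hypothetical, and the sequence is the one read from the Table 4 translation of this ORF.

```python
from collections import Counter

# 77ORF017 protein sequence, taken from the translation rows in Table 4.
SEQ = (
    "MTHNIEKRIN KLKTSGNPKF KKLDSDIHYL LKRFEGEKNH KGFYPKFKQG EIVFVDFGIN "
    "VNKEFSNSHF AIVMNKNDSN TEDIVNVIPL SSKENKKYLK MNFDLKWEYY LRLFLNLISA "
    "QNNSAILKEV FDKKYQKNNT EFITKDYFIE FISDSLEIEN KLNKIDRNIN NIVSAIDKVK "
    "KLKGNSYACI NSFQPISKFR IRKVLPQKIK NPVIDSSDIM LLINRINNNI LQIPDIR"
).replace(" ", "")


def percent(count: int, total: int) -> float:
    """Percentage rounded to two decimals, as printed in the table."""
    return round(100 * count / total, 2)


def charge_summary(seq: str) -> dict[str, int]:
    """Count acidic (E, D) and basic (K, R) residues and derive the charge rows."""
    counts = Counter(seq)
    acidic = counts["E"] + counts["D"]
    basic = counts["K"] + counts["R"]
    return {
        "length": len(seq),
        "acidic": acidic,
        "basic": basic,
        "total": acidic + basic,
        "net": basic - acidic,
    }


if __name__ == "__main__":
    summary = charge_summary(SEQ)
    print(summary)
    print("Lys %:", percent(Counter(SEQ)["K"], summary["length"]))
```

The pI, molecular-weight and hydrophobicity rows depend on pK values, residue masses and a hydrophobicity scale that the listing does not specify, so they are deliberately left out of this sketch.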
77ORF019 sequence

39851 atgaacgagcaaataataggaagcatatatactttagcaggaggt
1 MNEQIIGSIYTLAGG

39896 gttgtgctttattcagttaaagagatttttaggtattttacagat
16 VVLYSVKEIFRYFTD

39941 tctaacttacaacgtaaaaaaatcaatttagaacaaatatatccg
31 SNLQRKKINLEQIYP

39986 atatatttagattgttttaaaaaggctaaaaagatgattggagct
46 IYLDCFKKAKKMIGA

40031 tatattattccaacagaacagcatgaatttttagatttttttgat
61 YIIPTEQHEFLDFFD

40076 attgaagtctttaataatttagataagcaaagtaaaaaagcgtat
76 IEVFNNLDKQSKKAY

40121 gaaaatgttattggatttagacaaatgattaatttatcaaataga
91 ENVIGFRQMINLSNR

40166 gttaaggcaatggaagattttaagatgagtttcaacaatgaattt
106 VKAMEDFKMSFNNEF

40211 agtacaaatcagattttttttaatccttcttttgttatggaaaca
121 STNQIFFNPSFVMET

40256 attgctattataaatgaatatcaaaaagatatatcttatttaaaa
136 IAIINEYQKDISYLK

40301 aatataattaataaaatgaatgaaaatagagcttataatcatatt
151 NIINKMNENRAYNHI

40346 gatagttttatcacttcagagtaccgacgaaaaataaacgattat
166 DSFITSEYRRKINDY

40391 aatctttatcttgataaatttgaagaacagtttagtcaaaagttt
181 NLYLDKFEEQFSQKF

40436 aaaataaacagaacttcgataaaagaaagaattattattaattta
196 KINRTSIKERIIINL

40481 aacaagaggagatttaaatga 40501
211 NKRRFK*
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Physico-chemical parameters of ORF 77ORF019 

1 MNEQIIGSIY TLAGGVVLYS VKEIFRYFTD SNLQRKKINL EQIYPIYLDC FKKAKKMIGA

61 YIIPTEQHEF LDFFDIEVFN NLDKQSKKAY ENVIGFRQMI NLSNRVKAME DFKMSFNNEF

121 STNQIFFNPS FVMETIAIIN EYQKDISYLK NIINKMNENR AYNHIDSFIT SEYRRKINDY

181 NLYLDKFEEQ FSQKFKINRT SIKERIIINL NKRRFK

Number of amino acids:                            216
Average molecular weight (Daltons):               26026.06
Mean amino acid weight (Daltons):                 120.49
Monoisotopic molecular weight (Daltons):          26009.34
Mean amino acid monoisotopic weight (Daltons):    120.41

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot    Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       7       3.24%    7.58%                     Cys   C       1       0.46%    1.66%
Asp   D       10      4.63%    5.28%                     Glu   E       16      7.41%    6.37%
Phe   F       19      8.80%    4.09%                     Gly   G       5       2.31%    6.84%
His   H       2       0.93%    2.24%                     Ile   I       28      12.96%   5.81%
Lys   K       22      10.19%   5.95%                     Leu   L       12      5.56%    9.42%
Met   M       7       3.24%    2.37%                     Asn   N       23      10.65%   4.45%
Pro   P       3       1.39%    4.9%                      Gln   Q       10      4.63%    3.97%
Arg   R       11      5.09%    5.16%                     Ser   S       13      6.02%    7.12%
Thr   T       7       3.24%    5.67%                     Val   V       7       3.24%    6.58%
Trp   W       0       0.00%    1.23%                     Tyr   Y       13      6.02%    3.18%

Number of acidic (negative) amino acids (ED):     26    12.04%
Number of basic (positive) amino acids (KR):      33    15.28%
Total charge (KRED):                              59    27.31%
Net charge (KR - ED):                             7     3.24%
Theoretical pI:                                   9.52
Total linear charge density:                      0.28
Average hydrophobicity:                           -4.84
Ratio of hydrophilicity to hydrophobicity:        1.37
Percentage of hydrophilic amino acid:             54.17%
Percentage of hydrophobic amino acid:             45.83%
Ratio of %hydrophilic to %hydrophobic:            1.18
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77ORF043 sequence 



29304 atgtattacgaaataggcgaaatcatacgcaaaaatattcatgtt
1 MYYEIGEIIRKNIHV

29349 aacggattcgattttaagctattcattttaaaaggtcatatgggc
16 NGFDFKLFILKGHMG

29394 atatcaatacaagttaaagatatgaacaacgtaccaatcaaacat
31 ISIQVKDMNNVPIKH

29439 gcttatgtcgtagatgagaatgacttagatatggcatcagactta
46 AYVVDENDLDMASDL

29484 tttaaccaagcaatagatgaacggattgaagagaacacagacgaa
61 FNQAIDEWIEENTDE

29529 caggacagactaattaacttagtcatgaaatggtag 29564
76 QDRLINLVMKW*
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Physico-chemical parameters of ORF 77ORF043 

1 MYYEIGEIIR KNIHVNGFDF KLFILKGHMG ISIQVKDMNN VPIKHAYVVD ENDLDMASDL

61 FNQAIDEWIE ENTDEQDRLI NLVMKW

Number of amino acids:                            86
Average molecular weight (Daltons):               10186.68
Mean amino acid weight (Daltons):                 118.45
Monoisotopic molecular weight (Daltons):          10180.02
Mean amino acid monoisotopic weight (Daltons):    118.37

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot    Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       3       3.49%    7.58%                     Cys   C       0       0.00%    1.66%
Asp   D       9       10.47%   5.28%                     Glu   E       7       8.14%    6.37%
Phe   F       4       4.65%    4.09%                     Gly   G       4       4.65%    6.84%
His   H       3       3.49%    2.24%                     Ile   I       11      12.79%   5.81%
Lys   K       6       6.98%    5.95%                     Leu   L       6       6.98%    9.42%
Met   M       5       5.81%    2.37%                     Asn   N       8       9.30%    4.45%
Pro   P       1       1.16%    4.9%                      Gln   Q       3       3.49%    3.97%
Arg   R       2       2.33%    5.16%                     Ser   S       2       2.33%    7.12%
Thr   T       1       1.16%    5.67%                     Val   V       6       6.98%    6.58%
Trp   W       2       2.33%    1.23%                     Tyr   Y       3       3.49%    3.18%

Number of acidic (negative) amino acids (ED):     16    18.60%
Number of basic (positive) amino acids (KR):      8     9.30%
Total charge (KRED):                              24    27.91%
Net charge (KR - ED):                             -8    9.30%
Theoretical pI:                                   4.38
Total linear charge density:                      0.30
Average hydrophobicity:                           -2.80
Ratio of hydrophilicity to hydrophobicity:        1.19
Percentage of hydrophilic amino acid:             48.84%
Percentage of hydrophobic amino acid:             51.16%
Ratio of %hydrophilic to %hydrophobic:            0.95
77ORF102 sequence

29051 atgagcaacatttataaaagctacctagtagcagtattatgcttc
1 MSNIYKSYLVAVLCF

29096 acagtcttagcgattgtacttatgccgtttctatacttcactaca
16 TVLAIVLMPFLYFTT

29141 gcatggtcaattgcgggattcgcaagtatcgcaacattcatgtac
31 AWSIAGFASIATFMY

29186 tacaaagaatgctttttcaaagaataa 29212
46 YKECFFKE*



WO 00/32825 



PCT/IB99/02040 



160 

Physico-chemical parameters of ORF 77ORF102 

1 MSNIYKSYLV AVLCFTVLAI VLMPFLYFTT AWSIAGFASI ATFMYYKECF FKE

Number of amino acids:                            53
Average molecular weight (Daltons):               6155.42
Mean amino acid weight (Daltons):                 116.14
Monoisotopic molecular weight (Daltons):          6151.07
Mean amino acid monoisotopic weight (Daltons):    116.06

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot    Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       6       11.32%   7.58%                     Cys   C       2       3.77%    1.66%
Asp   D       0       0.00%    5.28%                     Glu   E       2       3.77%    6.37%
Phe   F       7       13.21%   4.09%                     Gly   G       1       1.89%    6.84%
His   H       0       0.00%    2.24%                     Ile   I       4       7.55%    5.81%
Lys   K       3       5.66%    5.95%                     Leu   L       5       9.43%    9.42%
Met   M       3       5.66%    2.37%                     Asn   N       1       1.89%    4.45%
Pro   P       1       1.89%    4.9%                      Gln   Q       0       0.00%    3.97%
Arg   R       0       0.00%    5.16%                     Ser   S       4       7.55%    7.12%
Thr   T       4       7.55%    5.67%                     Val   V       4       7.55%    6.58%
Trp   W       1       1.89%    1.23%                     Tyr   Y       5       9.43%    3.18%

Number of acidic (negative) amino acids (ED):     2     3.77%
Number of basic (positive) amino acids (KR):      3     5.66%
Total charge (KRED):                              5     9.43%
Net charge (KR - ED):                             1     1.89%
Theoretical pI:                                   8.18
Total linear charge density:                      0.13
Average hydrophobicity:                           10.8[illegible]
Ratio of hydrophilicity to hydrophobicity:        0.40
Percentage of hydrophilic amino acid:             28.30%
Percentage of hydrophobic amino acid:             71.70%
Ratio of %hydrophilic to %hydrophobic:            0.39
77ORF104 sequence

34393 atggtaaccaaagaatttttaaaaactaaacttgagtgttcagat
1 MVTKEFLKTKLECSD

34438 atgtacgctcagaaactcatagatgaggcacagggcgatgaaaat
16 MYAQKLIDEAQGDEN

34483 aggttgtacgacctattcatccaaaaacttgcagaacgtcataca
31 RLYDLFIQKLAERHT

34528 cgccccgctatcgtcgaatattaa 34551
46 RPAIVEY*
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Physico-chemical parameters of ORF 77ORF104 

1 MVTKEFLKTK LECSDMYAQK LIDEAQGDEN RLYDLFIQKL AERHTRPAIV EY 



Number of amino acids:                            52
Average molecular weight (Daltons):               6193.13
Mean amino acid weight (Daltons):                 119.10
Monoisotopic molecular weight (Daltons):          6189.12
Mean amino acid monoisotopic weight (Daltons):    119.02

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot    Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       4       7.69%    7.58%                     Cys   C       1       1.92%    1.66%
Asp   D       4       7.69%    5.28%                     Glu   E       6       11.54%   6.37%
Phe   F       2       3.85%    4.09%                     Gly   G       1       1.92%    6.84%
His   H       1       1.92%    2.24%                     Ile   I       3       5.77%    5.81%
Lys   K       5       9.62%    5.95%                     Leu   L       6       11.54%   9.42%
Met   M       2       3.85%    2.37%                     Asn   N       1       1.92%    4.45%
Pro   P       1       1.92%    4.9%                      Gln   Q       3       5.77%    3.97%
Arg   R       3       5.77%    5.16%                     Ser   S       1       1.92%    7.12%
Thr   T       3       5.77%    5.67%                     Val   V       2       3.85%    6.58%
Trp   W       0       0.00%    1.23%                     Tyr   Y       3       5.77%    3.18%

Number of acidic (negative) amino acids (ED):     10    19.23%
Number of basic (positive) amino acids (KR):      8     15.38%
Total charge (KRED):                              18    34.62%
Net charge (KR - ED):                             -2    3.85%
Theoretical pI:                                   5.03
Total linear charge density:                      0.38
Average hydrophobicity:                           -5.81
Ratio of hydrophilicity to hydrophobicity:        1.47
Percentage of hydrophilic amino acid:             53.85%
Percentage of hydrophobic amino acid:             46.15%
Ratio of %hydrophilic to %hydrophobic:            1.17
77ORF182 sequence

29268 atgttcaatataaaacgaaaaacggaggaagtcaagatgtattac
1 MFNIKRKTEEVKMYY

29313 gaaataggcgaaatcatacgcaaaaatattcatgttaacggattc
16 EIGEIIRKNIHVNGF

29358 gattttaagctattcatttcaaaaggrcatatgggcatatcaata
31 DFKLFILKGHMGISI

29403 caagctaaagatatgaacaacgtaccaattaaacatgcttatgtc
46 QVKDMNNVPIKHAYV

29448 gtagatgagaatgacttagatatggcatcagacttatttaaccaa
61 VDENDLDMASDLFNQ

29493 gcaatagatgaatggattgaagagaacacagacgaacaggacaga
76 AIDEWIEENTDEQDR

29538 ctaattaacttagtcatgaaatggtag 29564
91 LINLVMKW*
Physico-chemical parameters of ORF 77ORF182

1 MFNIKRKTEE VKMYYEIGEI IRKNIHVNGF DFKLFILKGH MGISIQVKDM NNVPIKHAYV

61 VDENDLDMAS DLFNQAIDEW IEENTDEQDR LINLVMKW

Number of amino acids:                            98
Average molecular weight (Daltons):               11691.50
Mean amino acid weight (Daltons):                 119.30
Monoisotopic molecular weight (Daltons):          11683.84
Mean amino acid monoisotopic weight (Daltons):    119.22

Amino acid composition

Acid  Symbol  Number  %        Average % in Swissprot    Acid  Symbol  Number  %        Average % in Swissprot
Ala   A       3       3.06%    7.58%                     Cys   C       0       0.00%    1.66%
Asp   D       9       9.18%    5.28%                     Glu   E       9       9.18%    6.37%
Phe   F       5       5.10%    4.09%                     Gly   G       4       4.08%    6.84%
His   H       3       3.06%    2.24%                     Ile   I       12      12.24%   5.81%
Lys   K       9       9.18%    5.95%                     Leu   L       6       6.12%    9.42%
Met   M       6       6.12%    2.37%                     Asn   N       9       9.18%    4.45%
Pro   P       1       1.02%    4.9%                      Gln   Q       3       3.06%    3.97%
Arg   R       3       3.06%    5.16%                     Ser   S       2       2.04%    7.12%
Thr   T       2       2.04%    5.67%                     Val   V       7       7.14%    6.58%
Trp   W       2       2.04%    1.23%                     Tyr   Y       3       3.06%    3.18%

Number of acidic (negative) amino acids (ED):     18    18.37%
Number of basic (positive) amino acids (KR):      12    12.24%
Total charge (KRED):                              30    30.61%
Net charge (KR - ED):                             -6    6.12%
Theoretical pI:                                   4.76
Total linear charge density:                      0.33
Average hydrophobicity:                           -3.89
Ratio of hydrophilicity to hydrophobicity:        1.28
Percentage of hydrophilic amino acid:             51.02%
Percentage of hydrophobic amino acid:             48.98%
Ratio of %hydrophilic to %hydrophobic:            1.04
Table 5

BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100017|lan|77ORF017 Phage 77 ORF|23269-23982|-3
         (237 letters)

393,678 sequences; 120,452,765 total letters

                                                                    Score     E
Sequences producing significant alignments:                         (bits)  Value

gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon; ...     41   0.010
gi|730607|sp|P23250|RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR P...     38   0.053
gi|3097044|emb|CAA75295| (Y15035) KIR [Cowpox virus]                   38   0.090
gi|2146245|pir||S73794 hypothetical protein H91_orf180 - Mycopl...     38   0.090
gi|83910|pir||S04682 ribosomal protein var1 - yeast (Candida gl...     37   0.15
gi|133135|sp|P21358|RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN ...     37   0.15
gi|2128843|pir||H64475 hypothetical protein MJ1409 - Methanococ...     36   0.20
gi|5107017|gb|AAD39926.1|AF126285_2 (AF126285) RNA polymerase r...     36   0.35
gi|2146210|pir||S73342 hypothetical protein E07_orf166 - Mycopl...     35   0.60

Database: swissprot
79,449 sequences; 28,874,452 total letters

                                                                    Score     E
Sequences producing significant alignments:                         (bits)  Value

sp|P23250 RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR PROTEIN.           38   0.014
sp|P21358 RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1.             37   0.040
sp|Q21444 LDLC_CAEEL LDLC PROTEIN HOMOLOG.                             34   0.35
sp|P27240 RFAY_ECOLI LIPOPOLYSACCHARIDE CORE BIOSYNTHESIS PROT.        33   0.46
sp|P53192 YGC0_YEAST HYPOTHETICAL 27.1 KD PROTEIN IN ALK1-CKB1.        33   0.60
sp|P32908 SMC1_YEAST CHROMOSOME SEGREGATION PROTEIN SMC1 (DA-B.        33   0.60
sp|P54663 TAGB_DICDI PRESTALK-SPECIFIC PROTEIN TAGB PRECURSOR.         32   0.78
sp|003100 CYAA_DICDI ADENYLATE CYCLASE, AGGREGATION SPECIFIC (.        32   0.78
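Each hit line in these BLAST summaries ends with a bit score and an E value, and very small E values are printed without a leading mantissa (for example `e-122`). As a rough sketch of how such lines could be read programmatically, the hypothetical helper `parse_hit` below splits the last two whitespace-separated fields off the description; it assumes one complete hit per line.

```python
def parse_hit(line: str) -> tuple[str, int, float]:
    """Split a BLAST summary line into (description, bit score, E value)."""
    description, score, evalue = line.strip().rsplit(None, 2)
    if evalue.startswith("e"):
        evalue = "1" + evalue          # BLAST prints 1e-122 as 'e-122'
    return description, int(score), float(evalue)


if __name__ == "__main__":
    lines = [
        "gi|3341966|dbj|BAA31932| (AB009866) orf 59 [bacteriophage phi PVL] 437 e-122",
        "sp|P23250 RPI1_YEAST NEGATIVE RAS PROTEIN REGULATOR PROTEIN. 38 0.014",
    ]
    for hit in lines:
        description, score, evalue = parse_hit(hit)
        print(f"{score:4d}  {evalue:.3g}  {description}")
```

Splitting from the right is used because the description itself contains spaces, while the score and E value are always the final two fields.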
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100019|lan|77ORF019 Phage 77 ORF|39851-40501|2
(216 letters)

Database: nr

373,355 sequences; 114,214,446 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|3341966|dbj|BAA31932| (AB009866) orf 59 [bacteriophage phi PVL]   437   e-122
gi|2689911 (AE000792) B. burgdorferi predicted coding region BB...    38   0.058
gi|1171589|emb|CAA64574| (X95275) frameshift [Plasmodium falcip...    37   0.10
gi|4493986|emb|CAB39045.1| (AL034559) predicted using hexExon; ...    36   0.23
gi|141257|sp|P18019|YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (OR...    36   0.29
gi|133412|sp|P27059|RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA...    35   0.51
gi|3122231|sp|Q58851|HISX_METJA HISTIDINOL DEHYDROGENASE (HDH) ...    35   0.51
gi|3649757|emb|CAB11106.1| (Z98547) predicted using hexExon; MA...    34   0.66
gi|2688313 (AE001146) sensory transduction histidine kinase, pu...    34   0.87

Database: swissprot

79,449 sequences; 28,874,452 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|P18019 YPI9_CLOPE HYPOTHETICAL 14.5 KD PROTEIN (ORF9).             36   0.079
sp|Q58851 HISX_METJA HISTIDINOL DEHYDROGENASE (EC 1.1.1.23) (H.       35   0.14
sp|P27059 RPOB_ASTLO DNA-DIRECTED RNA POLYMERASE BETA CHAIN (B.       35   0.14
sp|Q02224 CENE_HUMAN CENTROMERIC PROTEIN E (CENP-E PROTEIN).          34   0.31
sp|P04931 ARP_PLAFA ASPARAGINE-RICH PROTEIN (AG319) (ARP) (PRA.       33   0.53
sp|P18011 IPAB_SHIFL 62 KD MEMBRANE ANTIGEN.                          32   0.69
sp|P18709 VTA2_XENLA VITELLOGENIN A2 PRECURSOR (VTG A2) [CONTA.       32   0.90
sp|O64409 CP3H_CAVPO CYTOCHROME P450 3A17 (EC 1.14.14.1) (CYPI.       32   0.90
sp|P21358 RMAR_CANGA MITOCHONDRIAL RIBOSOMAL PROTEIN VAR1.            32   0.90
sp|Q03945 IPAB_SHIDY 62 KD MEMBRANE ANTIGEN.                          32   1.2



BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100043|lan|77ORF043 Phage 77 ORF|29304-29564|3
(86 letters)

Database: nr

373,355 sequences; 114,214,446 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|3341947|dbj|BAA31913| (AB009866) orf 39 [bacteriophage phi PVL]   182   6e-46
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo...    32   0.84
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN...    32   0.84
gi|1169735|sp|P42345|FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTE...    32   0.84
gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa...    32   0.84
gi|3875402|emb|CAA98122| (Z73906) cDNA EST EMBL:D64544 comes fr...    31   2.5
gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa...    30   4.2

Database: swissprot

79,449 sequences; 28,874,452 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP).        32   0.24
sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R.       32   0.24
sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C.       28   3.5
sp|Q24118 LIO_DROME LINOTTE PROTEIN.                                  28   3.5
sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II).                   28   3.5
sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT).                      28   3.5
sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA.                  28   3.5
sp|P38255 YBU5_YEAST HYPOTHETICAL 51.3 KD PROTEIN IN PHO5-VPS1.       27   6.0
sp|P55822 SH3B_HUMAN SH3BGR PROTEIN (21-GLUTAMIC ACID-RICH PRO.       27   7.9
sp|Q58462 YA82_METJA HYPOTHETICAL PROTEIN MJ1082.                     27   7.9
sp|P34252 YKK8_YEAST HYPOTHETICAL 52.3 KD PROTEIN IN HAP4-AAT1.       27   7.9
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100102|lan|77ORF102 Phage 77 ORF|29051-29212|2
(53 letters)

Database: nr

373,355 sequences; 114,214,446 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|3341946|dbj|BAA31912| (AB009866) orf 38 [bacteriophage phi PVL]    96   3e-20
gi|43252eB|gb|AAD17315| (AF123593) voltage-dependent sodium cha...    28   7.1
gi|2649684 (AE001040) A. fulgidus predicted coding region AF092...    28   9.3

Database: swissprot

79,449 sequences; 28,874,452 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|P42067 HUTM_BACSU PUTATIVE HISTIDINE PERMEASE.                     26   7.1
sp|P04775 CIN2_RAT SODIUM CHANNEL PROTEIN, BRAIN II ALPHA SUBU...     26   9.2
sp|P42619 YQJF_ECOLI HYPOTHETICAL 17.2 KD PROTEIN IN EXUR-TDCC.       26   9.2
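
Each query header above gives the ORF coordinates on the Phage 77 genome, followed by the length of the translated query in letters. The following is a minimal sketch of how the two relate. It assumes the coordinates are 1-based and inclusive and that each ORF ends in a stop codon that is not counted among the letters; neither assumption is stated in the reports. The function name `orf_protein_length` is hypothetical.

```python
def orf_protein_length(start: int, end: int) -> int:
    """Return the number of amino acids encoded by an ORF spanning start..end.

    Assumes 1-based inclusive nucleotide coordinates and a terminal stop
    codon that is excluded from the protein length (illustrative assumption).
    """
    span = abs(end - start) + 1  # nucleotides covered, inclusive
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3 - 1  # drop the stop codon


# Coordinates and lengths as printed in two of the query headers above.
queries = {
    "77ORF043": (29304, 29564, 86),
    "77ORF102": (29051, 29212, 53),
}
for name, (start, end, letters) in queries.items():
    assert orf_protein_length(start, end) == letters, name
```

Under these assumptions, both headers are internally consistent: 261 and 162 nucleotides correspond to 86 and 53 residues plus one stop codon each.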
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|100104|lan|77ORF104 Phage 77 ORF|34393-34551|1
(52 letters)

Database: nr

373,355 sequences; 114,214,446 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|2315523 (AF016452) similar to the leucine-rich domains found...    29   4.2
gi|4377168|gb|AAD18990| (AB001666) CT7H hypothetical protein (...     29   5.4
gi|3882171|dbj|BAA34445| (AB018268) KIAA0725 protein [Homo sapi...    28   9.3

Database: swissprot

79,449 sequences; 28,874,452 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|P04879 RRPP_VSVIG RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48.       27   5.4
sp|P04880 RRPP_VSVIM RNA POLYMERASE ALPHA SUBUNIT (EC 2.7.7.48.       27   5.4
sp|Q13946 CN7A_HUMAN HIGH-AFFINITY CAMP-SPECIFIC 3',5'-CYCLIC .       26   7.1
sp|P35381 ATPA_DROME ATP SYNTHASE ALPHA CHAIN, MITOCHONDRIAL P.       26   9.3
sp|P54659 MVPB_DICDI MAJOR VAULT PROTEIN BETA (MVP-BETA).             26   9.3
sp|P40397 YHXC_BACSU HYPOTHETICAL OXIDOREDUCTASE IN APRE-COMK.        26   9.3
BLASTP 2.0.8 [Jan-05-1999]

Query= sid|122748|lan|77ORF182 Phage 77 ORF|29268-29564|3
(98 letters)

Database: nr

393,678 sequences; 120,452,765 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

gi|3341947|dbj|BAA31913.1| (AB009866) orf 39 [bacteriophage phi..    182   8e-46
gi|1084792|pir||S54091 hypothetical protein YPR070w - yeast (Sa..     35   0.13
gi|1169736|sp|P42346|FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN..     32   1.1
gi|744518|prf||2014422A FKBP-rapamycin-associated protein [Homo..     32   1.1
gi|5051381|emb|CAB44736.1| (AL049653) dJ647M16.2 [FK506 binding..     32   1.1
gi|4826730|ref|NP_004949.1|pFRAP1| FK506 binding protein 12-rap..     32   1.1
gi|3282239 (U88966) rapamycin associated protein FRAP2 [Homo sa..     32   1.1

Database: swissprot

79,909 sequences; 29,054,478 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|P42345 FRAP_HUMAN FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP).        32   0.29
sp|P42346 FRAP_RAT FKBP-RAPAMYCIN ASSOCIATED PROTEIN (FRAP) (R.       32   0.29
sp|P40557 YIA5_YEAST PUTATIVE DISULFIDE ISOMERASE YIL005W PREC.       29   3.3
sp|Q24118 LIO_DROME LINOTTE PROTEIN.                                  28   4.4
sp|Q44363 TRAA_AGRT6 CONJUGAL TRANSFER PROTEIN TRAA.                  28   4.4
sp|P80034 ACH2_BOMMO ANTICHYMOTRYPSIN II (ACHY-II).                   28   4.4
sp|P34554 YNP1_CAEEL HYPOTHETICAL 42.2 KD PROTEIN T05G5.1 IN C.       28   4.4
sp|P22922 A1AT_BOMMO ANTITRYPSIN PRECURSOR (AT).                      28   4.4
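
Each hit line in the reports above ends with a bit score and an E value, and very small E values are abbreviated (for example, `e-122` stands for 1e-122). The sketch below shows one way such a summary line might be split into its three fields. It is illustrative only and not part of the disclosed method. The function name `parse_hit` is hypothetical, and it assumes the score and E value are the last two whitespace-separated tokens on the line.

```python
def parse_hit(line: str) -> tuple[str, int, float]:
    """Split a BLAST one-line hit summary into (description, bits, E value).

    Assumes the bit score and E value are the final two whitespace-separated
    tokens; everything before them is treated as the description.
    """
    description, score, evalue = line.rstrip().rsplit(None, 2)
    if evalue.startswith("e"):
        # BLAST abbreviates 1e-122 as "e-122"; restore the leading mantissa.
        evalue = "1" + evalue
    return description, int(score), float(evalue)


hits = [
    "gi|3341966|dbj|BAA31932| (AB009866) orf 59 [bacteriophage phi PVL] 437 e-122",
    "gi|2649684 (AE001040) A. fulgidus predicted coding region AF092... 28 9.3",
]
for hit in hits:
    description, bits, evalue = parse_hit(hit)
    print(f"{bits:>4} bits  E={evalue:g}  {description}")
```

Splitting from the right keeps the description intact even when it contains internal spaces, which is why `rsplit` is used here rather than a fixed column offset.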
Table 6

1st position                  2nd position                  3rd position
(5' end)            U         C         A         G         (3' end)

U                  Phe       Ser       Tyr       Cys            U
                   Phe       Ser       Tyr       Cys            C
                   Leu       Ser       Stop      Stop           A
                   Leu       Ser       Stop      Trp            G

C                  Leu       Pro       His       Arg            U
                   Leu       Pro       His       Arg            C
                   Leu       Pro       Gln       Arg            A
                   Leu       Pro       Gln       Arg            G

A                  Ile       Thr       Asn       Ser            U
                   Ile       Thr       Asn       Ser            C
                   Ile       Thr       Lys       Arg            A
                   Met       Thr       Lys       Arg            G

G                  Val       Ala       Asp       Gly            U
                   Val       Ala       Asp       Gly            C
                   Val       Ala       Glu       Gly            A
                   Val       Ala       Glu       Gly            G
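
Table 6 can be read as a lookup from codon to amino acid. The sketch below is illustrative only: it builds the standard genetic code in the table's U, C, A, G order and translates a short coding sequence. The names `CODON_TABLE` and `translate` are hypothetical and not part of the disclosure. The function assumes the input contains only A, C, G and T or U, and that reading starts at the first nucleotide.

```python
from itertools import product

BASES = "UCAG"  # order used for each codon position in Table 6

# One-letter amino acid codes for the 64 codons, enumerated with the first
# position varying slowest and the third fastest; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA or RNA coding sequence up to the first stop codon."""
    rna = seq.upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]  # KeyError on non-ACGU characters
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("AUGGCUUAA"))  # MA
```

A DNA string such as `"atgaaataa"` is handled the same way, because thymine is mapped to uracil before lookup.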
Table 7

Bacteriophage 3A, complete genome sequence

1 caaaegctag caacgeggat aaatttttca tgaaaggggg tctttacatg aagttaacaa aaaaacagct 

71 aaaagaatat atagaagatt acaaaaaarc cgatgacaca ttaattaatt cgtatacaga aacatatgaa 

141 ttttattgtc ggttaagaga tgaacttaaa aacagtgact taatgataga gcatacaaac aaggctggtg 

211 cgngcaatat tatcaagaat ccattaagca tagaactgac aaaaacagte caaacactaa ataacttact 

281 caagtccatg ggttcaactg cagcacaaag aaaaaagata gttcaagaag aaggtggatt eggxgactae 

351 caaagtttta aatgaacctt caccaaaact attaacaaca tggcatgcag agcaagtcac tcaagggaaa 

421 ataaaaacaa gcaaatatgt tagaaaagaa tgtgagagac atcctagata tetagaaaat ggaggtaaat 

491 gggtatttga tgaagaatta gcgcaccgtc ctattcgacc tat&gaaaag ttttgtaaac cttccaaagg 

561 atetaaaegt caacttgtat tacagccacg gcaacatttt attatoggca gtttgtttgg ttgggttcat 

631 aaagaaacaa aactgegcag gtttaaagaa gctttgatat ttatggggcg aaaaaatggt aaaacaacca 

701 ctatttctgg ggttgctaac tatgetgtat cacaagatgg agaaaatggt gcagaaattc atttgttagc 

771 aaacgtaacg aaacaagcta ggattctatt tgacgaacct aaggcgatga ttaaagctag cccaaagctc 

841 gacaaaaatt tcagaacact aagagatgaa atccactatg aegcaacgat atcaaaaatt atgccccaag 

911 catcagatag cgataagtta gatggattga atacacacat ggggattttt gatgaaattc atgaatttaa 

981 agactataaa ttgatttcag ttataaaaaa ctcaagagct gcaaggttac aacctcttct catctacatt 

1051 acgacagcag ggtatcaatt agatggtcca cttgttgata tggtagaagc gggaagagac accccagatc 

1121 aaatcataga agacgaaaga actctttatt acttagcatc teeggatgat gacgatgata ttaatgattc 

1191 gecgaactgg ataaaagcaa atcccaacct aggtgtccct ataaatttag atgagatgaa agaagagtgg 

1261 gaaaaageca agagaacacc agetgaaegt ggagatttta taaccaaaag gtttaatacc ttegctaata 

1331 atgacgagac- gagetttatt gattacccaa cactccaaaa aaataatgaa actgtctctt tagaagagct 

1401 ggaaggcaga ccgtgcacga ttggttatga tttatcagaa acagaggact ttacagccgc grgtgecact 

1471 tttgcgtt.Bg ataatggtaa agttgcagtt ttategcatt catggattcc taagcacaaa gttgaacatt 

1541 ctaacgaaaa aataccctat agagaatggg aagaagatgg cttattaaca gtgeaagata agecttatat 

1611 tgactaccaa gatgttttaa attggataat taagatgaat gagcattatg tagtagaaaa aattacttat 

1681 gatagagega aegcattcaa actaaatcaa gagttaaaoa attaegggtt tgaaacggaa gaaacaagac 

1751 aaggagcttt gaccctgagc ectgeattga aggatttaaa agaaatgttt ttagatggga aaataatatt 

1821 taaeaataat ccttcaatga aatggtatat caacaatgtt cagttgaaac tagacagaoa eggaaactgg 

1891 ttgeegtcta agcaaagcag atategtaaa atagatggct ttgeagcatt tttaaacaca tatacagata 

1961 ttatgaataa agttgtttct gatagtggtg aaggaaacat agagtttatc agtattaaag acacaatgeg 

2031 ttaaggaggt gaatgttatc gcaaaagaga atattgtcac aegcataaag aaaaaattga cagacaattg 

2101 gattgatcag tcaactccta agctttatga ctttagccca tggaaaaata gatctttttg gggtgtaatc 

2171 aataataege ttgaaactaa tgaaacgata ttttcagcta ttacaaagtt atctaattcg atggctagtt 

2241 tgcccttgaa aatgtatgaa gattataaag tagttaatac agaagtatct gat t tact ta cagtgtcacc 

2311 gaataattct ctgagcagtt ttgattttat taatcaaact gaaacaatca gaaatgaaaa aggtaatgea 

2381 tatgtgctaa ttgaacgaga catctatcat caaccatcaa agcttttctt attaaatcca gacgttgttg 

2451 aaatgttaat tgaaaaccaa teaegtgaac tttattattc cattcatget gcaactggaa ataaattgat 

2521 tgctcataat atggacatgt tgcattttaa acacategtg gcatctaata tggtgcaagg cactagtccg 

2591 actgatgtgt tgaagaatac aactgatttt gataatgeag taagaacctt taatcttaca gaaatgcaaa 

2661 aacctgattc tttcatgett aaatatggtt ccaatgtagg taaagaaaaa aggcagcaag tgttagaaga 

2731 ttccaaacag cactatgaag aaaacggtgg aatattattc caagagectg gtgttgaaat egaacegtta 

2801 cctaaaaaat atgtctctga agatatagcg geaagegaga atttaacaag agaaagagta getaacgett 

2871 tteaattgee ctcagtattc ttaaatgeaa gatcaaatac aaatttcgcg aaaaatgaag agttaaacag 

2941 attttacttg cagcatacct tattgecaat cgtcaaacag catgaagaag aatctaatcg gaaactactt 

3011 actaaaacag acagagaaaa aaataggtat tttaaattta aegttaaate ttatttaagg gctgatagtg 

3081 caacacaagc agaagtgtac tttaaagcag ttcgtagtgg ttactacact ataaatgaca ttagagagtg 

3151 ggaagattca ccaccagttg aaggtggaga taagcegcta ataageggtg atttataccc aattgacacg 

3221 ccacttgaat taagaaaatc tttgaaaggt ggtgataaaa atgtcaatga aagctaagta ttttcaaatg 

3291 aaaagaaaat caaaaagtaa aggtgaaata tttatttatg gtgatattgt aagtgataaa tggtctgaaa 

3361 gtgatgtaac tgctacagat ttcaaaaata aactagatga actaggagac atcegtgaaa tagaegttea 

3431 tataaattca tctggaggca gtgtatttga agggcatgea atatacaata tgctaaaaat gcatcctgca 

3501 aaaattaata tctatgtcga tgccttagcg gcatcaattg ctagtgttat cgctatgagt ggtgacacta 

3571 tttttatgea caaaaatagt tttttaatga ttcataattc atgggttatg actgtaggta atgeagaaga 

3641 gttaagaaag acageggatt tacttgaaaa aacagatget gttagtaart cagcttattt agataaagca 

3711 aaagatttag ateaagaaca cttaaaacag atgttagacg cagaaacttg gettactgea gaagaagect 

3781 tgtctttegg cttgatagat gaaattttag gagctaatga aataactget agtatctcta aagagcaata 

3851 taagcgtttc gagaaegtec cagaagatct aaagaaagat gtagacaaaa ccactaaaat cgatgatgta 

3921 gatacgtttg aattggttga aacacctaaa gaaagtatgt eaetagaaga aaaagaaaaa agagaaaaaa 

3991 ttaaacgega atgegaaatt ttaaaaatga caatgagtta ttaggaggaa atgaaatgee gacattatat 

4061 gaattaaaac aatccttagg tatgattgga caacaattaa aaaataaaaa tgatgaattg agtcagaaag 

4131 caacagaccc aaatattgat atggaagaca tcaaacaact agaaacagaa aaagcaggct tacaaeaaag 

4201 atttaacatt gttgaaagac aagtaaaaga cattgaagaa aaagaaaaag cgaaagttaa agacacagga 

4271 gaagctcatc aatctttaaa tgatcatgag aagatggtta aagctaaggc agagttttat cgtcacgcga 

4341 ttttaccaaa tgaacttgaa aaaccttcaa tggaggcaca aegtttatta cacgctttac caacaggtaa 

4411 tgattcaggt ggtgataagc ccttaccaaa aacactttct aaagaaattg tttcagaacc atttgetaaa 

4481 aaccaattac gtgaaaaagc tegtctaact aacattaaag gtttagagat tccaagagtt tcatatactt 

4551 tagacgatga tgacttcatt acagatgeag aaacagcaaa agaattaaaa ttaaaaggtg atacagttaa 

4621 attcactact aataaatcca aagtatttgc tgeaatttea gatactgtaa tccatggatc agatgtagat 

4691 ttagtaaact gggttgaaaa cgcactacaa tcaggtctag cagctaaaga aegtaaagat gcctcagcag 
4761 taagtcctaa atctggatta gatcacatgt cattttacaa tggatctgt:: aaagaagctg agggagcaga 

4831 cacgtatgat gccattatca acgctttagc agattcacat gaagactacc gtgataacgc aacaatttat 

4901 atgcgatatg cggatcatgx caaaatcatc agcgttcttt caaatggaac aacaaatcce tttgacacac 

4971 cagcagaaaa agcatttggc aaaccagtag catttacaga cgcagcagtt aaacctattg tgggagattt 

5041 caaccattct ggaatcaact atgatggaac aactcatgac actgataaag atgttaaaaa aggcgaacat 

5111 rtgtttgtat taaccgcatg gtatgatcag caacgtacat tagacagtgc acccagaaet gcaaaagcaa 

5181 aagaaaatac aggctcatta cccagctaag ccccaaaagg ttaatgtaac agetaaggct a oat cage eg 

5251 taataccagc cgaacagggg tgatgaaatg agtttagaag aaattaaatt gtggttgaga attgactaca 

5321 acttcgaaaa cgatctaatt gaaggtctca ctcaatcggc taagtccgaa ttactattaa gtggggttcc 

5391 agattatgac aaagatgacc tggaacaccc gcttttttgt acagegatta gatatatcat tgcaagagat 

5461 tatgaaagtc gtgggtactc aaatgaccaa cctagaagca aggtttttaa egaaaaggga ttgcaaaaaa 

5531 tgattctgaa accaaaaaag tggtaggtga ttettaaatg gaacttaatg aatttaaaga tcgcgcatac 

5601 tttttecaat atgtaaataa agggecgrat ccagacgaag aggaaaaaac gaagctgxat agttgctttt 

5671 gcaaaatata caatccttct atgaaagata gagaaattet aaaagegact gaatcaaagt caggactaac 

5741 cataattatg aggtceecta aaattgaata cctaccacaa acaaatcact tagttaaaat tgacagaggc 

5811 ttatattccg ataaattatt caacattaaa gaaataagaa ttgatacacc agatattggc tataacaeag 

5881 tggttttatc egaaaaatga gcgtagaaat taaagggata cctgaagtgt tgaagaaatt agaateggta 

5951 tacggtaaac aaccaatgea agctaagagc gatagagctt taaatgaagc acctgaattt tttataaagg 

6021 ctttaaagaa agaattcgag agttttaaag ataegggege tagcatagaa gaaatgacta aatctaagee 

6091 ttatacaaaa gtaggaagtc aagaaagagc tgtettaatt gaatgggtag gecctatgaa tegcaaaaae 

6161 attatccact tgaatgaaca eggtcaceca agagaeggaa aaaaacatac accaagaggt ttcggagtta 

6231 ttgeaaaaac attagctget aatgaacgga agtatagaga aatcataaaa aaggagtegg ccagataaat 

6301 gaatatatta aacaccataa aagaaatttt attatctgat gcagagctcc aaacatatat aaatcetaga 

6371 atacactatt ataaagtcac tgaaaatget gaaacttcca aaccttttgt egttattaca cctatttatg 

6441 atttacetce agaccccacg tccgacaaat atcteagtga agaacactta attcaaatag atgeagaatc 

6511 ttcaaataat cagaaaacaa ttgatataac aaaacgaata agatatctgt tatatcaaca aaatttaatt 

6581 eaagcatcta gtcagttaga tgcttatttt gaagaaacta aacgrtatgt gatgtcgaga egttatcaag 

6651 gcataccaaa aaatatatat tataaaaatc agegcatega ataggtgtgc tttttaattt ctaaggagga 

6721 aacaageaat ggcagaagga caaggttctt ataaagcagg tttcaaaaga ttatacgttg gagtttttaa 

6791 cccagaagca acaaaagtag ttaaacgeat gacatgggaa gatgaaaaag gtggtacagt tgatctaaac 

6861 atcacaggtt tagcaccaga tttagtagat atgtttgcat ctaacaaacg tgtttggatg aaaaaacaag 

6931 gtaetaatga agttaagtct gacatgagxa tttttaatat cccaagtgaa gatctaaata cagttattgg 

7001 tcgttccaaa gataaaaatg gtacatcttg ggcaggagag aatacaagag caccatacgr aacagttatt 

7071 ggagaatctg aagatggttt aacaggtcaa ccagtgtacg ttgcgccact taaaggtact tttagctegg 

7141 atccaattga atttaaaaca cgaggagaaa aagcagaagc accagagcca acaaaattaa ctggtgactg 

7211 gatgaacaga aaagtegacg ttgatggtac tccacaaggt attgtacaeg ggtatcatga aggtaaagaa 

7281 ggagaagcag aattcttcaa aaaagtattc gttggataca cggacagtga agatcattca gaggattctg 

7351 caagttcgtt acccagctaa cccccaaaat gttgaagtag cagttaactc aaaatctgea acagtttcag 

7421 cagaataggg gecttcaaaa taaatcaaag gagaataatt tatgactaaa actttaaagg ttcataaagg 

7491 agacgacgcc gtagcttctg aacaaggtga aggcaaagtg tcagtaacrt tacctaattt agaageggat 

7561 acaacttacc caaaaggtac ttaccaagtg gcatgggaag aaaatggtaa agaatceagt aaagctgatg 

7631 tacctcaaCt caaaaccaat ccaattctag tetcaggegt atcatttaca cccgaaacea aatcaatcac 

7701 ggtaaaegct gatgacaatg ttgaaccaaa cattgcacca agtacagcaa cgaataaaac gttgaaatat 

7771 acaagtgaac atccagagtt tgttaetgtt gatgagagaa caggagcaat tcacggtgta gctgagggaa 

7841 cttcagttat cactgctacg tctactgacg gaagcgacaa gtccggacaa aetacagtaa cagtaacaaa 

7911 tggataatta tttgagaege agaatatctg cgtctttttt atttgaataa aaggagctaa tacaatgatt 

7981 aaattegaaa ttaaagaceg taaaacagga aaaaeagaga gecatacaaa agaagatgtg acaatgggcg 

8051 aageagaaaa aegctacgag tatttagaat cagtaaatca agagaataaa aaagaagtac ctaacgcaac 

8121 aaaaatgaga caaaaagagc gacagttatt agtagattta tctaaagatg aaggattgac tgaagaagat 

8191 gttttgaaca agatgagcac caaaacttat acaaaagece tgaaagatat atttcgagaa atcaatggtg 

8261 aagatgaaga agattcagaa actgaaccag aagagatggg aaagacagaa gaacaatctc aataaaagat 

8331 actttatcga acattaagaa aatacaacgt ttctgtatgg agcagtatgg gtggacatta actgaagtca 

8401 gaaaacagee gtatgtaaaa cttttagaaa tacttaatga agagaataaa gaagagactg aagaaaaaca 

8471 aagtgaacaa aaagtcatta caggtaegga tttaagaaaa ctttttggaa gctagaaagg aggttaatat 

8541 gaatgaaaaa gtagaaggca tgaccttgga gctgaaatta gaccaettag gtgtccaaga aggcatgaag 

8611 ggttcaaagc gacaattagg tgttgttaat agtgaaatga aagctaatct gtcatcattt gataagtctg 

8681 aaaaatcaat ggaaaagtat caggegagaa ttaaggggtt aaatgataag cttaaagttc aaaaaaagat 

8751 gtatcctcaa gtagaagatg agcttaaaca agttaacgct aatcatcaaa aagctaaatc tagtgtaaaa 

8821 gatgttgaga aagcatattt aaagecagta gaagetaaca aaaaagaaaa attagctctt gacaaatcta 

8891 aagaagcett aaaatcttcg aaeacagaac ttaaaaaagc tgaaaatcaa tataaaegta caaatcaacg 

8961 taaacaagat gcatatcaaa aacttaaaca gttgagagat gcagaacaaa agcttaagaa tagtaaccaa 

9031 gctaccactg cacaactaaa aagagcaagt gaegcagtae agaagcagtc cgctaagcat aaageactcg 

9101 ttgaacaata taaacaagaa ggcaatcaag ttcaaaaact aaaagtacaa aatgacaatc tttcaaaatc 

9171 aaaogaaaaa atagaaaaet ettaegctaa aactaatact aaattaaagc aaacagaaaa agaatttaat 

9241 gatttaaata atactattaa gaatcatagc gctaatgtcg caaaagcega aacagctgtt aacaaagaaa 

9311 aagctgcttt aaataattta gagcgttcaa tagataaagc ttcatccgaa atgaagaccfc ttaacaaaga 

9381 acaaatgata gctcaaagtc attteggcaa acttgetagt caageggatg tcatgtcaaa gaaacttagt 

9451 tctattggag ataaaatgac ttccctagga cgxacgatga egaegggegt atctacaccg attactttag 

9521 ggttaggcgc agcattaaaa acaagtgcag acttcgaagg gcaaatgtct cgagttggag egattgeaca 

9591 agcaagcagt aaagacttaa aaagcatgtc taatcaagcg gttgacttag gegctaaaac aagtaaaagt 

9661 gctaaogaag ttgetaaagg tatggaagaa ttggcagctt taggctttaa tgccaaacaa acaatggagg 

9731 etatgeeggg tgttatcagt gcagcagaag caagcggtgc agaaatggct acaactgeaa ctgtaatggc 

9801 atcagcaatt aattcttteg gtttaaaagc atexgatgea aaccatgttg ctgatttact tgcgogatco 

9871 gctaatgata grtgetgeaga tattcaatac atgggagacg cattaaaata cgcaggtact ccagcaaaag 

9941 catcaggagt ttcaatagag gacacttctg cagcaattga agttttatct aactcagggt tagaggggtc 

10011 tcaagcaggt actgeattaa gagcttegtt tottaggcta gccaatccaa gtaaaagtac agctaaggaa 
10081 atguuut toggtattca cttgtctgac gctaaaggtc aatctgttgg catgggtgaa ctgattagac 

10151 agttccaoga caacatgaaa ggcatgacga gagaacaaaa actagcaaca gtggctacaa tagtcggcac 

10221 tgaagcagca agtggatttt cagcctcgat cgaagcgggc ccagataaaa ttaatagcto tagcaaatco 

10291 ttgaagaacc ctoatggtga aagtaaaaaa gcagctgatt tgatgaaaga caaccccaaa ggcgctctgg 

10361 oacaattagg tggcgctctt gaaccgttag caaccgaagt tggtaaagat ttaacgccta tgattagagc 

10431 aggtgcggaa ggattaacaa aactagtcga tggatttaca catctccctg gttggtttag aaaggcttcg 

10501 gtaggtttag cgatttttgg tgcatctacc ggccctgccg ttcttgctgg cggcctatca acacgtgcag 

10571 ttggaagcgc ggctaaaggc tacgcatcac taaatagacg cattgccgaa aatacaatac tgcctoatac 

10641 caattcaaaa gcaacgaaar ccctaggtct tcaaacccca tcccttggrt ctacaacagg ■■■aacgtca 

10711 aaaggcttta aaggattagc cggagetatg ttgtctaatz caaaacctat aaatgttttg aaaaattctg 

10781 caaagccagc aattccaccg cccaaacctt tgaaaaacgg tttaggacta gccgcaaaat ccttatttgc 

10851 agtaagtgga ggcgcaagac ttgctggtgt agccttaaag tttttaocag gacctatagg tgccacaata 

10921 actgctatta caattgcata taaagttttc aaaaccgcat atgatcgtgt ggaatggttc agaaacggta 

10991 ttaacggttt aggagaaact ataaagtttt ttggtggcaa aattattggc ggtgctgtta ggaagctagg 

11061 agagtttaaa aattatcttg gaagtatagg caaaagctcc aaagaaaagt tcccaaagga catgaaagat 

11131 ggttataaat ctttgagtga cgatgacctt ctgaaagtag gagtcaacaa gtttaaagga tttatgoaaa 

11201 ccatgggcac agcttctaaa aaagcatctg atactgtaaa agrgttgggg aaaggtgttt caaaagaaac 

11271 agaaaaagct ttagaaaaat acgtacacta ttctgaagag aacaacagaa tcatggaaaa agtacgttta 

11341 aactcgggtc aaacaacaga agacaaagca aaaaaacctt tgaaaattga agcggattta tctaacaacc 

11411 ttatagctga aatagaaaaa agaaacaaaa aggaacccga aaaaactcaa gaacttattg ataagcatag 

11481 tgcgttcgat gaacaagaaa agcaaaacat tttaactaga actaaagaaa aaaacgactt gcgaattaaa 

11551 aaagagcaag aactcaatca gaaaatcaaa gaattgaaag aaaaagcttt aagtgatggt cagatttcag 

11621 aaaatgaaag aaaagaaatt gaaaagcttg aaaatcaaag acgtgacatc actgttaaag aattgagtaa 

11691 gaccgaaaaa gagcaagagc gtattttagc aagaatgcaa agaaacagaa atgcctattc aacagacgaa 

11761 gcgagcaaog caatcaaaga agcagaaaaa gcaagaaaag caagaaaaaa agaagcggac aagcaatatg 

11831 aagatgatgt cattgctata aaa&ataacg tcaacctttc taagtctgaa aaagataaat Cactagctac 

11901 tgctgaccaa agacataagg atgaagtaag aaaggcaaaa Cctaaaaaag acgccgtagc agacgtcgtt 

11971 oaaaagcaaa ataaagatat tgataaagag atggatttat ccagtggtcg tgtatataaa aatactgaaa 

12041 agtggtggaa tggccttaaa agtcggtggt ctaacttcag agaagaccaa aagaagaaaa gtgataagta 

12111 cgetaaagaa caagaagaaa cagccogtag aaacagagaa aatataaaga aatggttcgg aaatgcctgg 

12181 gacggcgtaa aaactaaaac tggegaagct tttagtaaaa cgggcagaaa tgctaatcat tctggcggcg 

12251 aaatgaaaaa aatgtggagt ggaatcaaag gaattccaag caaattaagt tcaggttgga gctcagccaa 

12321 aagttctgta ggatatcaca ctaaggctat agctaatagt actggtaaat ggtttggaaa agcttggcaa 

12391 tctgttaaat cgactacagg aagtatttac aatcaaacta agcaaaagta ttcagatgcc tcagataaag 

12461 cttgggcgca ttcaaaatct actcggaaag ggacaccaaa acggtttagc aatgcatata aaagtgcaaa 

12531 gggctggcta acggatatgg ctaataaacc ccgctcgaaa tgggacaaca cctctagtac agcatggtcg 

12601 aacgcaaaac ccgtttggaa aggaaeatcg aaatggttta gtaactcata caaatcttta aaaggttgga 

12671 ctggagatat gtattcaaga gcccacgatc gttttgacgc aatttcaagt ccggcatggt ctaacgctaa 

12741 atcagtattt aatggtttta gaaaatggct atcaagaaca tacgaatgga tcagagatac tggtaaagac 

12811 acgggaagag ctgcggctga tttaggtaaa aatgttgcta ataaagctat tggcggttta aacagcatga 

12881 ttggcggtat taataaaata tctaaagcca ttactgataa aaatctcatc aagccaatac ctacattgtc 

12951 tactggtact ttagcaggaa agggtgtagc taccgataat tcgggagcat taacgcaacc gacatttgct 

13021 gtattaaatg atagaggttc tggaaacgcc ccaggcggtg gagttcaaga agtaattcac agggctgacg 

13091 gaacatteea cgcaccccaa ggacgagatg tggttgttcc actaggagtt ggagatagtg taataaatgc 

13161 caatgacact ctgaagttac agcggatggg cgtcttgcca aaaccccacg gtggtacgaa aaagaaagat 

13231 tggctagacc aacttaaagg taatataggt aaaaaagcag gagaacccgg agctacagct aaaaaoacag 

13301 cgcataatat caaaaaaggt gcagaagaaa tggctgaagc agcaggcgac aaaatcaaag atggtgcatc 

13371 ttggttaggc gataaaateg gcgatgtgtg ggattacgta caacatccag ggaaactagt aaataaagta 

13441 acgccaggtc taaatattaa ttttggaggc ggactaacgc tacagtaaaa attgctaaag gcgcgtactc 

13511 atcgctcaaa aagaaattaa cagacaaagc aaaaccgtgg cctgaagatt ttggtggcgg aggcgatgga 

13581 agctatccat ttgaatatcc aatctggcaa agatttggac gctaeacagg tggacttaac tccaatgacg 

13651 gtcgtcacta tggtatagac tttggtatgc ctactggaac aaacgtttat gccgttaaag gtggtatagc 

13721 agataaggta tggactgatt acggtggcgg taattctata caaattaaga ccggtgctaa cgaatggaac 

13791 tggtatatgc atttatctaa gcaattagca agacaaggcc aacgtattaa agctggtcaa ctgataggga 

13861 aatcaggtgc tacaggtaat ttcgttagag gagcacacct acatttccaa ttgatgcaag ggtcacatcc 

13931 agggaatgat acagctaaag atccagaaaa atggttgaag ccacteaaag gtageggegr tcgaagtggt 

14001 tcaggtgtta acaaggctgc acctgcttgg gcaggcgata tacgtcgtgc agcaaaacga atgggtgtta 

14071 atgttacctc gggtgatgta ggaaatatca ttagcttgat tcaacacgaa tcaggaggaa acgcaggtat 

14141 aactcaaect agttcgccta gagacaccaa cgctttacag ggoaatccag caaaaggact gccceaatat 

14211 atcccacaaa catttagaca ttatgccgtc agaggtcaca acaatataca tagtggccac gaccagttat 

14281 tagogttctc caecaacaga tattggcgcc cacagtttaa cccaagaggt ggccggcctc caagcggtcc 

14351 aagaagatac gogaatggtg gtttgactac aaagcatcaa cttgctgaag cgggtgaagg agacaaacag 

14421 gagatggtta tccctttaac tagacgtaaa cgagcaattc aattaactga acaggttatg cgcatcatcg 

14491 gcacggacgg caagccaaac aacaccaccg caaacaatga cacttctaca gttgaaaaat cgccgaaaca 

14561 aatcgtcatg tcaagcgata aaggaaacaa actaacagat gcattgattc aaactgtttc tcetcaggat 

14631 aataacttag gttctaatga tgcaattaga ggtctagaaa aaatatrgtc aaaacaaagt gggcatagag 

14701 caaatgcaaa taattatatg ggaggttcga ccaattaatg caatcttttg taaaaatcat agacggtcac 

14771 aaggaagaag taataacaga tcctaatcag cctatatttt tagatgcaag ggctgaaagr ccaaacacca 

14841 atgataacag tgtaaccatt aacggagtag acggtatctt accgggcgca attagttttg cgccttttcc 

14911 attagtacta aggtttgget otgatggtat agatgttata gattcaaatt catttgagca ttggttcaga - 

14981 tctgtgttta atcgcagaca cccccaccat gtcattacct ctcaaacgcc cggtgttaaa tacgcagtga 

15051 acacagccaa tgtcacatct aatttaaaag atggttcttc aaccgaaatt gaagtaagtt taaatgetca 

15121 taaagggtat tctgaaccag ttaactggac cgatagcgag ttcttattcg accctaattg gacgtttgaa 

15191 aatggaactc ctcttgatct caeaeetaaa tatactcata catcaaacca act tact att tggaacsggtt 

15261 ctactgatac gataaatcca cgattcaagc acgatttgaa aatattaatt eatttaaatg cgagtggagg 

15331 atttgaactg gttaaccata caacaggtga tatttttaag tacaacaaaa gtatagataa aaacactgat 
tttgttttag 
taacattagc 
gttccctttc 
aattgacttg 
taggtacatc 
aacttcataa 
tttttattga 
tgatgacgac 
gcaaatcaaa 
acgaattagg 
cccaaatgat 
catcaacata 



aaatggttct 
tgcaagtatg 
ttttagacgg 



atggtgtgta 
gccaggtaaa 
arttataggt 
cgaaaattta 
acttttacag 
trtatcatgg 
agttacggca 
agtagcgaaa 
aaacctcggt 
taacaaaaac 
acggagatat 
ataccgatac 
gcacacagct 
ataaaagaag 



■ttgatatat 
atttaattge 
gcagtttggg 
tttgcaaaaa 



aettgacagg 
gtaeaa&tcc 
gtataaatca 
ataatggcaa 
actggaaagc 
catttcggta 
accgcacaga 
tgctacctta 
aggtcgagat 
acagagttta 
gatcgeaagc 
eagtgctaac 
ttagacgatt 
aagaaattct 
acctcttaca 
gctagaaata 
caggcggggg 
cttaaatggt 
aagtatggta 
aagaaaataa 
tatagag&ta 
acgagaccaa 
caggagacag 
tcaggttaat 
caaatatact 
atagaacaca 
ttttatcatg 
gggatgttaa 
taccaagaga 
ccagatattg 
ggagacactt 
caaaaatgtc 
agattttcca 
tttgttcaca 
aaacaagttc 
gtagctataa 
gagagagtga 



ccatttaaag 
aagtttcggt 
aaaaagtttg 
aaaacCatta 
ttgaaaaagc 
gcaacccgct 



gtggttctac 
atttgatttt 
aatcatccaa 
tgacctatog 
gtcagaatta 
atggccgaaa 



ttatgggact 
aaccccccaa 



caagcaaatc 
attgataaag 
ctccaaataa 
tgacaactca 
attcgatacg 
agcaaataaa 
agatagcgta 
tcacatccat 
aacaagcgct 
tccatacact 
cagaagaagt 
cataaatggg 
gacgacaaaa 
aacccccaaa 
ccagaaagga 
tatagagcta 
agttgcatca 
agcagtaaac 
gagattgttc 
gggaaaccca 
tgatatcgag 
aacgcagcaa 
atggctatat 
ccatggtaca 
aataatgaga 
tgcaagatgt 
aatcctatac 
agaagtttag 
caaacgaaac 
taatccaaat 
gcrgactatg 
atgacaaaga 
tcgraetttc 
agtgacacag 
cagagccagg 
atggcgrgat 
acgcgtaata 
cggactggaa 
agatattaac 
actgaacgta 
ggceagttcg 
agggccatgg 
atctagaaaa 
ttcctctgtt 
gcacgaatag 
gtatettaaa 
tgcagaatta 
tttagcatga 
tggatcgtgc 
t aaagaaaaa 
aatagccata 
atacggatga 
ttcaaattgg 
aataatccag 
gaggtgctag 
accgaacgge 
gctattaatg 
gtaacgctaa 
cggtgtgggc 
tttccaggtg 
accccccaga 



tgcaratcga 
aatgaatcca 
aggegattta 
ctagatgtag 
tttatagaac 
cgaaaaatac 
taccacacaa 
ctggtaaaae 
caaaatgacc 
ggctcagaat 
gtttttattc 
cgtatctgca 
gaggaaaaga 
gtactcaccg 
agttagatcc 
aagcaaatct 
gcaagcacgt 
aaaagctaag 
ggccgcaacc 
ctaacacgca 
tgatactcce 
ttctctgttc 
ttgtaaacgc 
gacataaaca 
agactaaagg 
atggatcate 
accatggttc 
cacccataac 
acaattaagc 
cgtatgaatt 
gccagaatac 
tataaaataa 
actgtaaaga 
tcctgaaaca 
actgtcagta 
aaaattataa 
taccgcaaca 



cttgttttgc 
cttagaaacg 
ccttgtatgt 



agtactttga 



ataccttacg 
ataaattaga 
cggcaaactg 
aggegcatta 
aggaaacaca 
gaaagctaaa 
aaacctttac 
aagcccgagt 
aacccaaacc 
tatcgttttg 
tgcaatcctt 
gctaagtcgt 
cataacggtt 
atacattagt 
atttacagga 
agaactgaga 
acgatgttga 
ccaaccaacg 
aatagaaact 
gtggaacacc 
tagtggtaaa 
atgactgggc 
gtggcagagt 
tcteeactat 
gcaggtcggt 
gttatgcaag 
ttatgtgcct 
atagtaggca 
aaggggtage 
taatagtgct 
actttacacg 
cacagcgcat 
cttcttLtta 
cgattcgagg 
atttcaatta 
ggtaattcag 
ccccatccga 
gaaagcaatg 
ggtctatcag 
tatctgatCC 
gaaacatgaa 
ta 
it 

eactaacggt 
acaaacaaag 
attacaatac 
attatacgca 
cctacattgt 
gtgactctac 
tggtgatgga 



aagactgaac 
atgaattaat 
aatagatgaa 
gtcattgcac 
atcctccttt 
gcgccacccg 
caggcagtag 
taatttaaaa 
cggaaaataa 
tggaaaatct 
aaacttacaa 
acaatccaac 
aaagtcaaat 
taatgtaaac 
aataaagagg 
aacctgataa 
ctggtttgat 
ctaagaccta 
atagatatat 
tcgtctcaag 
cacccagaaa 
gacctagaag 
taaaaatatt 
cagggtgtga 
acttaacggc 
agactcattt 
aaagctttga 
aaagaggtat 
taaacctcta 
tcatacactg 
tcttggaagt 
gaatatgatg 
aaaaacagtg 
tgtcgtttta 
cggttggaac 
acagcatccg 
aagggagaat 
tatgaaaata 
ataccactaa 
caaaggagtc 
cccaatgatg 
acgtggtagt 
aacaaaatta 



9*gcgggaat 
agacaccagt 
atgaccatct 
Ctttaaagaa 
agttctgtct 
agacagcgcc 
tcaaaatcac 
tcttcagatg 
ttggaaattc 
agcggtagac 
ttttaccaaa 
catcggaatt 
tcctattaga 
accgggccca 
agggcrctca 
taagtcggtt 
atattcttag 
atgtcggaac 
aactgxtgae 
gatatcgaaa 
tagatgttaa 
gggatataac 
gtgtcxttca 
aagacaacag 



tgacacaaac 
gatactaaaa 
atcagcaatg 
tattatgaac 
tegatttact 
aaaggttgaa 
tcagtggaaC 
agtaettaag 
taagcgaaaa 
ctaeccggcc 
gaagcgagaa 
aagaacagct 
acaactgaca 
aagctactat 
aggtggaata 
cagtctgaaa 



agaggcatta 
ccacacctaa 
gaccccaatg 
tgaatgaagc 
aattegtgaa 
ggtgataaag 
caaataagct 
atatggattt 
gtaccgattg 
gtataacrta 
agcgattcga 
ataaaagttt 
tcaaatattc 
taactccgat 
tataagttga 
caocagattt 



tgaaaaacca 
tacgtcgcag 



acattcaaat 



aaaacgataa 
tgataccgat 
ataggcgaag 
cagttgacga 
gtatctaaca 
agaatagagc 
acttcgctga 
taacgcactt 
ggcgaaacct 
aaactataag 
gcaagaattt 
aatagaacag 
acggtcaatt 
tgatgatgag 
cacacgccta 
aaccctacat 
tcactgggaa 
gataaagtct 
cttttgacga 
tttcgattta 
cctggcgaat 
tgctaggtgt 
tttagaaaca 
ccaatgaggc 
accatacagc 
caagccacca 
acttttgaaa 
gtaaacggga 
cttaaccacg 
ttatatgtag 
ctgagacact 
tacaagrtaa 
ettcaaatct 
aaataatcaa 
atggtagtcg 
taattaaacg 
tgtcgagaga 
caeca tactg 
taaaaaaegg 
agcaaaaccc 



gcctccaagc 
agate act a a 
aggtgatcca 
accgccgaac 
cgtttgttaa 
acctcttgaa 
gatgacaagt 
acttaaatga 
agagectgge 
aatggtggag 



agtcagtgaa 
agatgaeggt 
acccaaccca 
atttagtagt 



accgttcaaa 
tcaataaaag 
gagttcagac 
agtccetctg 
gcgttcacga 



ctacataggt 
actgaattaa 
geaacgaaac 
acaeaaetac 
gacataggga 
acagtaccct 
ggattgcccc 
caggaattat 
tagegggaaa 
aaggatcegg 
aatttgegct 
gcaaatctac 
gttgtggcac 
ttgacacatt 
agaatcaaat 
ttacttgega 
gcacagtaca 
tatagacagc 
ctatggactt 
atgtggaaat 
cacccctgtc 
cttgaaaact 
tgcacaaaat 
aaaatacttg 
gaaacaggag 
ttgeggaage 
tactgtcggt 
cctcactcaa 
ccgataaacc 
ccaaaccgac 
caaaccggcg 
gggtgctttc 
gagagtacct 
gatgacacaa 
aagettcaaa 
actgaaaaat 
tgagtaatct 
agacataact 
ccgttattat 
ctccactaga 
agatggaagt 
actatcacat 
ccgaatccca 
cgaagatcat 
tcaagtacag 
attcaagaac 
cagtggtggr 
aagataaege 
cttatgtctc 
aaottcagat 
gcgggttcat 
gcgcccaatc 
gtattcggtt 
caatttattt 
aaoaaoaaaa 
gtttggacca 



acagtcttaa 
acagegcaaa 
gccgttagaa 



aggttgctaa 

aaaggacatg 

caagcaaatc 

gtgtattaat 

ccagctacac 

aagaeggtat 

tcaaaatagc 

cteggtacag 

gagectcago 

tggcttgaca 

tataatggca 

caegcaataa 

aaaagaacat 

aagaccaaac 

cagaacttgc 

catgacacaa 

ccatcgactg 

acagctccat 

tagecatgge 

ataaatgaaa 

caatgaatta 

cagcatccct 

tateggtata 

aagaagcgta 

agaaggtttg 

ggtgatggaa 

gaggegttec 

taagaacctt 

gatcccccat 

gtgacgcaat 

tggaagaact 

ccatccatca 

aacgctctac 

cacaggtggc 

catgatagta 

agagaaatct 

cccagaacag 

cgagtgaaga 

aatattagac 

tatcaagctc 

ttaacgttga 

ggaattagaa 

gcgagcctga 

atgaattaaa 

actcgatgag 

ctagtcacaa 

agaccaccgg 

gcaagccoca 

tacaagegta

ggtctgagtg 

aaaagctaat 

attettgatg 

tattaatttt 

gatttcattg 
gtcgtacatc
cgceaataga 
atgagataat 
tttctcccaa 
tcagaagaaa
tcttacgaaa 
taagcaaaat 
actgaaaatg 
agtaaagaag 
gagaaaagta 
tggtgcaatg 
aagactatgt 
tagatagaaa 
aaatatcaga 
ttgttaaaaa 
ttggtcgtgt 
tctggaaaaa 
gtaaatcaat 
tacttaccgt 
tcaaaagcta 
gcaatgacac 
aaccaagcag 
agcgtcacga 
caatattcca 
tcaccgcaaa 
ttgagagcgc 
cgttgcgcaa 
tttattagac 
ccgccaaeaa 
tggagcagta 
tatttaagac 
atacagcata 
tgacattgtt 
agtcaattca 
gaggtgtaac 
atccgaatta 
aaactaatag 
ctaaaaacga 
agaaactggr 
agaattactg 
gatggactac 
taatagaata 
cttcggtact 
aaacaaacgt 
ccaactatat 
acgggttttt 
aaatacttaa 
acgattttta 
cttcgtttca 
aacagaagat 
aagtataaca 
acaaaaacac 
ccecaatgta 
ggttataaca 
aaacaattag 
caggagtttt 
tatcaatact 
tttaataacg 
ctctatctac 
caacacaacc 



tccaaccata 
gggacaatta 
tccat&cgct 
gttcttagac 
aagatgacct 
aatggttgct 
gcactaacgg 
ettoaactaa 
acncagcgtg 
tccagaaaac 
tttggattta 
ctgaaaaatt 
cttcgaagaa 
gacatcaaga 
ctatttttgg 
ttctggttta 
aggagcaaac 
-cttagcgaa 
tgtcgcttta 



ccaaacgatg 
acaaaattac 
accactggtg 
ctaaagcctt 
gcatcaacag 
agtatgcaga 
caaaacaact 
ctccaccaac 
gtatgtagat 
ctagagtcar 
ccaaacgaca 
cgacagaaca 
ctaaggcgtg 
cgtggaccct 
cattcaaagg 
gtaagtgtaa 
aaacggacgc 
caaaggcatc 
tataccacgt 
aagccgaaaa 
tctatttcga
aggagtgaga 
gctccgaaga 
taaatatcca 
attgacagtg 
aacaagtcgt 
tgrgacactc 
attcgaagat 
atggaagtta 
aggttataat 
cgaacaagac 
gaagacagtc 
acaaagaaga 
aggatcaata 
aggtgatcac 
gtaatagtta 
aaaagtaata 
agcccgattc 



cttaggaagt 
aaataatgca 
aggtactgat 
aatggggaaa 
aagaacaaaa 
ccaaagraca 
aataaaaaac 
attaaaacat 
tagataaaga 
cttatggctt 
tggcgttcaa 
tgagaacgca 
agatgaaaaa 
gggacgattc 
cacgcccaag 
agagtcagtg 
acaagataca 
cagtagacga 
tccaacatct 



ctacgaatat 
aaaaatggtt 
ctacgcaaat 
ettgataata 
agttggacat 



tgataattea 
acgtctctta 



aaacctacaa 
tttaccgtca 
ctatatttaa 
tggcgcccgg 
cgccataaaa 
gaatcttctt 



agtttccttt 
gtccetcata 
ccaaacgaag 
catctacacg 
ttctctctta 
caaaaaccag 
atecccctcc 
atctatagaa 



cctggttggg 
caaatttccc 
gcaagcagca 
ggaaacggaa 
atgccggtca 
cggtgtcaat 
ccagaaacac 
atgcagacac 
acctegtaac 
ggtcttatca 
ccggcgcgat 
aaagaatccg 
tattacacag 
gtgtaccacc 
ccatatcgct 
agcagtttcg 
tgcccatcat 
tcctagtaca 
cgtggtttta 
cccgaaataa 
ctctactaaa
tggtcaaaaa 
cgaatctaaa 
acaagtagcg 
aagacgcttt 
agatcatata 
ganttaataa 
taggtggtaa 
ctataataaa 
aggtcacata 
cccgcactat 
ctgatttaaa 
ctttgttcta 
agaaagttta 
aaccagtact 
ctcagctggt
aacgatgaat 
cgaaegaata 
cttttcaaaa 
ctctcaccac 
tggttaactt 
acctttaggt 
tctattttge 
ctaaagcgcc 
cgactttgat 
cctgataaga 
ataaaagcgt 
catttgatgg 
catgccagca 
gtgaccttat



tgtcgttttc 
aetttcacat 
gtcccgaaac 



caagtataga 
aacgatttag 
ttagggaagc 
tgacagcaac 
tgaaaaatac 



ggtaggtgtt 
agtteaatee 



attaaaccta 
caaacgaacg 
tgaagtcgca 
gtaggtaaca 
atttagacgc 
tattgataaa 
gacttactaa 
ctaataaaaa 
tcatggxaag 
ccagtgccag 
ttgccaatgc 
caataacgca 
aatagtggac 
gtaagtttag 
ttaaaattaa 
taaattattc 
tgtttattat 
tagtaaaaaa 
agttaaaaag 
aagactatca 
gctgataaca 
acaagtgggg 
gattttaaaa 
aaagcaacga 
attatctacc 
ttttaatagc 
ataaaaagta 
ccatataaat 
egattttttc 
aaatttgaat 
acagtgtttg 
tgaaattgga 
tatagttata 
agcgatcccg 
cgiacactaa 
ccaacctcct 
cttctgttca 
cactcaacgt 
atctccacct 
aactcataag 
aattagttat 
acacaggcgc 
atcaccatac 
cttaetccat 
atgtcccctg 
cgtctcgtct 
ccagctgcac 
tctgctcccc 



cgttfcggcca 
cgttacaaga 
agtgttggag 
aaaaaaccat 
cgattttata 
ttatatggtg 
aaaaagatta 
agcaggagaa 
agtatacaag 
atgttaacgt 
tgatatggat 
cctatcggtg
caggttatac 
caaaggtaat 
acaatcaaat 
aacgtcgtta 
tgcagtctga 
taaacagtta 
tgtgttcgta 
caatcaaaat 
acacatttgt 
gtttaatata 
gccgcaacat 
atattgagaa 
ggccacacaa 
atgcaaggtt 
ggtggccttt 
caaaaataaa 
ggtccatcaa 
ggtgataaga 
atcta&caat 
tattgaaaca 
ttaattcagt 
tatttaatcc 
ttgtataagc 
aaagactttg 
atttactaga 
acacacactt 
ttcttagctt 
ttgggttact 
etacacttgt 
attttttgtg 
tgaatggttg 
tttcattata 
cgtcaaccac 
ttcggattca 
ctaatacaac 
ctttaacaca 
tctctaaaaa 
cacatgcaat 
caattgttca 



gggcaaataa 
atggtggcgg 
aaattggaat 
catgtecact 
acaaagccaa 
gcccgcagcc 
cgtaaatata 
gcccaagtca 
cggcctatat 
agcgcaagtg 
atgtcactaa 
atcagcagaa 
tggatcaaga 
gtgtgatatc 
acccgataaa 
aacgtaaggg 
atgacggcgc 
tattgctaca 
taattgraca 
aetttcacat 
ctgtgtgcta 
acaaattatt 
agatatttta 
aaaacgtaat 
tgtcgttagg 
cattggtgat 
aatattcagt 
etatcaattc 
ccaatacaat 
atagattcag 
caggaggtaa 
tgactcaatt 
gataggtcca 
cgcttgatat 
tgcttacagg 
cttaattgtt 
ctattagaca 
aaaacagaat 
agttgagcga 
gtagacctta 
ttcctgataa 
acgagtagct 
aggegtttet 
aaataaatcc 
attaccacta 
aactcccttt 
aatacaactt 
gagataccaa 
gagtgeaatt 
ggttccattg 
atacttcttc 
atacgatact 
ttcgcatagr 



caaagaggct 
aatattagct 
actgaaaatt 
cagtttttaa 
cacagtcgcc 
aagttatcga 
tagaagaggt 
ggtatcaatt 
ggaatacgea 
cttaattcga 
cgegattaga 
agaaaaaatc 

aacaaagaga 
taagtacatt 
ggaattttag 
cttcggcact 
tegtattgat 
tgagaatata 
caagaaggta 
ggcaagcgcc 
gaccaatgtt 
tgacccgctt 
ttacaaggtt 
ttaaaaacta 
ageeggacat 
ggtaaaggtt 
attacgatga 
aagcgttact 
ggccatggtt 
taacgccaaa 
atcacaagac 
tgggttaaat 
gtgggcatgt 
aaacaactta 
ataaatataa 
aaaactatga 
tagtgaggct 
aataaegtae 
aeggctatte 
atactgtatc 
ggagaggtag 
tgatgaacct 
gaatatatta 
tgaccaaaaa 
tataacttgt 
aactoggtaa 
aaaatttaca 
aataatcact 
ggegctgagg 
ttgattttgt 
aaagactact 
attggectea 
taaatgttag 
tggttcattt 
tetaggggeg 
gtcagtgaaa 
ggttccatag 
tccaaaagca 
aaatttatta 
aaagagacaa 
cattgaagag 
caagacttca 
aacaacaaat 
agtgccttct 
tetegctttt 
ttacttagta 
caagtattta 
gttaaaactt 
caaacactgc 
cgeceattae 
actaacatag 
gtaccatctt 
aatcaccatt 
aegcaatatg 
agcttagact 
taagtacgtt 



ctggtgcgaa 
aacaagegta 
taceagaaaa 
cgaagattat 
ectgacgaca 
tgcaagttaa 
taaaggagag 
gaaagaatat 
attatcacag 
ataaagtggg 
agaaaacgat 
tatgacaagt 
aaaatgctaa 
tgttatagcc 
gatatagctt 
ggcttcttat 
cttagcatta 
tcaccaataa 



gataacaaaa 

taeggattte 
tataegctta 
tgatagcttt 
gttgaaatcg 
ggacaaatgg 
cccaatgtat 



ataacgatcc 
taccgccaag 
atgtatcaag 
cacaggggtB 
tattatctca 
ggacaaataa 
attatcgett 
cttgtattcc 
aaaacaccag 
cgtacaaaaa 
aactaattca 
aatggctata 
acaaggcagg 
taggcaggta 
aattttaaaa 
gttgttatgg 
ttggtaatga 
atcttttaat 
aagaaaggaa
cctactgcta 
tagtcaaaag 
taaagataaa 
tattacaatt 
aaacaaatga 
ccaaacatta 
aattattcaa 
cttcttctta 
aaaacaaggt 
ttttataatt 
aatattttgt 
tctggttaat 
gttgtttaac 
ggtgaactca 
aagcatctga 
tiaaattggat 
aatccttcgc 
tgttttcacc 
aagccataat 
ogegcattat 
catatactac 



cttaacatta
tettegcata 
taatagaatc 
aactaaaata 



cttcacaccc 
ttcttggcgg 
ggaggtgtga
acmtcac: 
ttcattttcc 
gtcgctaacc 
cgaaaccctc 
gcaaaagtcg 
agggggttca 
ggaactaace 
gaattaatgg 
cgatttcttt 
taacgttaac 
accaatcagc 
aaactcaatt 
agtaccagca 
aceaacacca 
aaaatttaca 
acgatttoga 
ttaattgacg 
tctcggatag 
ctattgcctt 
aaatgeegaa 
ttcgctatga 
tctaaaccca 



gtctgttgta 
aatcttcaca 
catcctgaca 
gttccatagt 
ctcatataag 
ttgacaccga 
atgac&acta 
aaaaagaagt 
cagagactcc 
taaaccttaa 
caaagaagag 
tcaggtgcaa 
tcgcaaaaga 
tggcttcgaa 
tcaatttttg 
gagatatcoa 
atgaaggagg 
tgtggcatgg 



acaatcctaa 
acagtcacga 
atcttacgtc 
caaatctcac 
atgcaagccc 
tccaaatcaa 



tatggaagtg 
ttgaagtacc 
tcttgccttc 
catatcttta 
ataatttcat 
aacttztatg 
gcgtagcaga 
tgctaaagca 
acaactccag 
gtttcgaaag 
tcgaaagaaa 
ttctcagtaa 
tctgtcgcta 
tcaattcate 
gagtgacact 
aaactatcat 
aactacaaat 
aaatcaatgg 
aaatatctaa 
atcttttctg 
gcaacgctat 
tatctagagc 



ttttgagaaa 
etctcttaac 
gtcaccaata 
tctaatttta 
tacgtcgagc 
gtgactaagt 
aggaataaca 
agattggaaa 
aatagcgatg 
aagttataaa 
aaactgtttc 
tactagcata 
atgcagtttc 
gaacctaacc 
gtgcaccaaa 
aaggagcata 
ctgacgcttc 
aaaaagtgat 
gogaataata 
caccagaaaa 
tcaattgttc 
gtagaaaatc 
tgatcagaaa 
agcagtgetg 
acegcgggat 
gctacttgtc 
tatgacctta 
gaagtttttg 
aattggaaag 
agagacacca 
gaagaacgca 
agaaggtatt 
gtagattcat 
aaccagttga 
gttccttaaa 
aagctatcea 
gaagctagac 
ccaccgagta 
cccacgagct 
ttctcaaaat 
aacgtagaag 
cggatccagg 
cgaaattatt 
tacggctcgg 



gacaccgaaa 
cccgxcccaa 



tgttaaacca 
tgtaagataa 

tgctgagtga 
agaaaaatac 
cgcccgcgaa 



aaagtgagta 
aagtggtcca 
ttcatcccac 
aatgaacact 
gaaagtcacc 
ggacaaacct 
ccciaccaga 
eaaactcatt 
cacgccgttt 
tgaaaacaot 
ttacacactc 
cccatcagat 
aacaaatgaa 
ctcatcctat
tacagctact 
acaaacccca 
cacatatcga 
ggagtatgta 
tatacattga 
gcataaaaaa 
egcttcacgg 
ccgcaagtac 
ggagcaagta 
caaoaaaaaa 
ggatatccaa 
tcggggcatc 
gaagaacaat 
tcaagttaat 
tttgaagaat 
caactgtaca 
agaagttaag 
aaagcagata
ctaccaaaga 
cactcaaata 
ccaaggcaag 
aagtgagtta 
tataagcgaa 
aaaaatataa 
caaatacgtc 
gaccttaaat 
gcgcatatga 
agataacttt 
ttagccagac 
agcacceatg 



tgacggatac 
aataacttat 
agctaatttc 
gttttattgt 
cgaccccgcc 
catcactgtc 
cctattttct 
cacccccata 



gaactaagat 
eatcagatac 
gaccaggaac 
ttcaaagtaa 
aggaacccag 
ttgtatgtat 
taaataatct 
atcaagggca 
cacaagatca 
ttaacggaaa
tagaaataaa 
acattcatct 
ggcgaagaaa 
gaagtacagt 
ctattcagca 
cggtaecagg 
tcccagcgat 
cgcaacatec 
acagtgcaag 
tactatcaca 
aaeacatgca 
tggcgtgttg 
tcgagttatt 
gattcgttta 
taaaactatt 
agagaaagtt 
gaaacttetg 
cccctgataa 
agagcattat 
gagctcatgc 
cgaaggtacc
catttcagtc 
atcaatatta 
cgaagccttg 
cctgaatctt 



gcaccagatc 
gaatgataae 



actgctcagt 
tccactgaag 
tcgcttataa 
cagaacacgt 
gcagaacttt 
acgccgttac
cagccaaaac 
cgtcaacttc 
tctttttctc 
tataaaagtc 
acgtaccccc 
taaaccatac 
accggaatga 
aagctaaaaa 
tgacaactaa 
tcatagcgaa 
agraagaatc 
ggaagattga 
aaaaagctta 
taattcagac 
ctatatatct
gaaactacta 
ttaaaagtga 
ttaaataagc 
cttttttctt 
ctttagcgaa 
tctaggtaat 
tgggagataa 
agtcatctat 
gaactgatgg 
atgcctcggt 
cccacaagat 
ctctaatatc 
ttcg&tgaac 
ccaaaaccac 
tattagatcc 
acaggagtac 
ctaagacagc 
ateaccaaca 
tggcaacctt 
tattgaagca 
acaacagtct 
agetttcaag 
atcctcagca 
ttgtagagta 
tgctatcacc 
gaaggatcgc 
aggaacgtaa 
caaagtcctc 
aaaggagtga 
aattegtgaa 
atacaactgg 
acgggaacat 
aggactatca 
tgtactcatg 
atatcctata 
acgagcaact 
ttccgcaaca 
aaatccacac 
tcgaaccgca 
ggcaagatca 
gccaaccaat 
aaacaagaat 
aaagaagcac 
etgatctaac 
gaugaatCt 
gaaaaaattg 
aa&gcctagt 
gcagacaaaa 
tcaaacacga 
cagcgaagag 
agcagagacg 
ttggtactgg 
cattgaagtt 
ctaacgtacg 
agctaceaat 
cggtgaaggt 



egtctctgta 
cteggcagtt 
attaagtcgg 
agcttcttta 
ccgaaaacga 
aaatcaagtt 
ccaaaaataa 
gtagaagttt 
attagcagat 



cgcagtattt 
gataatcgag 
garattcatt 
aaccttcacc 



agaagttaga 
aataatgacg 
ggaagctcaa 
tgtacaagat 
ttgagtgaaa 
atgaaaagag 
agaaggctat 
aagaaagcaa 
gcacttaatt 
ctccctgtaa 
cgcaatcacg 
agcgagattg 
ccaaattata 
tactactatg
tccgatctca 
tgatggttat 



gttacaaacg 
aaagcccgat 
atcgagtata 
cacctaaatg 
ggaggacact 
aatgctacaa 
actcagaaga 
tcacccgatt 
gtccatgacc 
gcgaatacaa 
agtttcagaa 



attaaaaaaa 
agtgcaagta 
tcccaataac 
tcateaccga 
caatatcgtg 




tgatacaaat 



tatataaact 
aaacgacaaa 

gcaacgcctg 



aatgatagtt 
cagccgacga 
atcttagaaa 
cgtctgaaat 
cttattcaaa 
cttccaaaat 
cggacaagac 
tccttaggtc 
gcacaggcgt 
ctcaggagcg 
catgcattag
tagagatgcc 
aaagttatac 
tcgaaatacc 
tgattaatat 
aatgagcgac 
ccgtttctat 
aggaacaccc 
gtcttaaata 
catgacaacc 
cctcaagact 
gttacattta 
aacgaacaaa 
tttaaggagg 
ttacgtgtgc 
caargccaaa 
taaagattat 
agaaataaac 
ccgattttac 
gcaagtggag 
gxccagctct 
aggcctaaca 
ctgcgcgaat 
acgatgtaat 
tgacgtcatt 
tcagccacag 
acattcatac 
ateaagatca 
gagcetaaag 
tgcaaaatgt 
gcccgacacc 
aactatcctg 
aaaagccagt 



atcataaaca 

cttgcctcta 
ttttacaatg 
ccccacggca 
cgctaccaat 
tgttacatga 
ccgtatctta 
tattgcaggt 
atattgttgg 
gataacaaca 
cccacaagtt 
gaaaeatcag 
atagagaact 
aaaaaccaca 
gcaacaccca 
accgcgaaga 
ttctaaatca 
acacataaaa 
acctcactac 
tcacgaagaa 
attatataag 
ccaactctga 
taagaaaaaa 
actgttgtag 
gtgatgaccc 
agttaatcaa 
taaatactga 
agatgaaaca 
gtcteatcog
ttaacgaact 
gaatgegaga 
caaaacaacg 
tgctgaagaa 
eagtttgagc 
atgtcgaaga 
agctttattt 
atactttcag 
ataaccctca 
agccogcacg 
ctccaatggg 
caggtagtca 
gectcaaaag 



gattcactac 
gttettcett 
actaagatca 
aatacccata 
aaatattacc 
aaacaaaagg 
cgcacctaaa 
aagataaatc 
ctaaagtcga 
atggaacaaa 
aaggcgagaa 
aatcaataaa 
ccgcta&aaa 
ataccagaaa 
cccagcagca 
ctaactatcg 
acacgaaaac 
cataaagtgg 
atcaagtgcg 
acagaagagt 
cceettgcca 
ageaattttc 
acttaaagga 
taataggega 
aacaccattg 
cagattgaat 
ttgttaacga 
accaagtgat 
tttagcttct 
tcgttgaaat 
naacgcaacc 
cctcaaaagg 
tgacacggaa 
catcatttca 
aatgattggg 
gaeacttect 
caaatagtgt
catcttcgaa 
caatttaacc 
tctttagtaa 
ttatacacga 
tccagcaaag 
agcttatcaa 
tcaacacagg 
ataccaccaa 
cacaaatcca 
caatttaggt 
gaagagtatt 
gctacctatt 
ageatggeca 
taaagaaact 
gagttattaa 
cgatgttgtt
ggaugactg 
aacgtaagaa 
tagagaaata 
tggcaatatt 
actatcaact 
gcccaactgg 
caggaaaaga 
tggtgcggat 
ataaatgeat 
gccaaaccgt 
ggracatticg 
tcaataaagc 
gtaegtaget 



gttggaagct 



gcggtgtact 

acctagacca 

accatcatac

gaaccgacct 

ctgtagaccc 

ccaccacact 

ccgatgaagt 

cgtagaaggt 

eaeaaacctg 
aaaatagacg



acgtagaagt 
ttgggtettt 
gtgctcgata 
atagecetac 
aacggttcag 
cccaaaacca 
cgcttcaatt 
gcattatatc 
gatttaccac 
aaggtaatga 
acaatggcgt 
gtaccggtag 
getatcaagg 
acaaggttta 
gctgcaacca 
gttttctaac 
ttggggtagt 
ggtgggaagt 
ttgaagcatt 
aaatggactt 
agrgaegggt 
ccaatgcttc 
aaaagaaaat 
aaggtcccag 
aatgacattg 
aaaatgcgcg 
agegtcaaat 
Bcgaaatgtg 
ccacaaaaae 
aagcacaacg 
gcgcactaca 
aactagcact
cetagcaaag
catgcaaaga
acatatccat
aagacatttg 
agcaaccaac 
aattaactaa 
tgatgaaaac 
gceactttta 
acgagattea 
egacagaact
gaaggagagt
tgatgaagca
aacaaagagc
acgaatttgg



ggtttaccta
ogggtcaagt



cgtcgcaetc
gccggtttat
acgcgggata
gagtaaettt
ttatgtatga
ggacacccga



gaaatttgcc 
agaaatgaca 
gaccaacaca 
agcagagcaa 
acagttattg 
gattacttaa 
gtgtgaaaaa 
ctacggtgcc 
ccagataccg 
ccaatgttca 
actagcagta 
tcagatgtgt 
aaagcataac 
tggcgctgga 
gttgatagtt 
atactgtaaa 
gattgaaetg 
caagtcgttg 
tagccgagaa 
aggttttaaa 
aaggaaatcg 
ttacttctcc 
tgttgacatt 
tcggcggact 
atttagaggt 
agagaagrtt 
gcgatgaacc 
aaaagatgaa 
agagcggaat 
attcacgtga 
agcgcaacca 
catacggcga 
cgcaacaggt 
gcgaagttct 
atacttaaac 
gtagtgaatg 
acaeagcgga 
ettatctcaa 
gtgggcgcga 
tttatttact 
gtgtcaaagt 
agaaggagtc 
tttcttgcta 
tgaccaaata 
agcaagtett 
cggagaagaa 
caaagatggt 
gaataaccgc 
tcateaaaag 
atgctatgca 
cgtctataaa 
tagaatgoca 



ggtactggaa 
aactagaaat 
cccccaagac 
agtgattttt 



taactagccg 
tcatggtaat 
ggxcggagtc 
atggtagcta 



gctttaaaag 
ggogtaacgc 
atcccgaaag 
cctagtggaa 
aatttatggg 
tatcgttcaa 
atagtcggcc 
aaactaccat 
gttttatatg 
agaattccta 
atctatttaa 
ggaatgatgg 
cagaaaaata 
attgtgtgga 
gaaaaagaga 
acgaagaaga 
tccgtactgg 
gtaacagaaa 
cattgaaact 
aatgcaacaa 
acgrccaaag 
caactcctcg 
gcacctatca 
agagctcgaa 
ggagatagaa 
gaaaatccgg 
tcggtgtttt 
tgaagaagaa 
tataaatctg 
aagcattcca 
atggaacacg 
cgaagcaacg 
agcaagrgca 
gaaagagtca 
gaacaaaccg 
agatcgagga 
agatgcaatc 
catgaggagg 
gaacgaaatc 
aaaaggcagt 
tagtggtgta 
ttagggacca 
cttctggtat 
tgtcataaat 



ttattaagtc 
ctgcaataga 
cggaaagata 
ectctcagac 



tatgaagcat 



acgcatcata c 
gagcctcagc t 
gttagacect a 
gcaactgcaa «j 
atgtccatga t 
gaataagcct £ 
aaggattagg t 
cagaagtcga c 
taatccaggt c 
ctagaagaaa i 
tcacatatct < 
agaaaagacc c 
gagaaaaaat < 
aagaaagaga t 
tccgacgtca c 
agtagacatg i 
atagacttta t 
aatactcgtc t 
agcttttgac i 
gaectaagag i 
ccttcacggg < 
aeacatacaa i 
atgatgaaaa 1 
agctatcatt 1 
taaaaattgt 1 
gtaaccgaag i 
catcatatga < 
catcccaaac < 
gtccaaaaga . 
tgatgagctt i 
tgggataggt 
aaettggaat | 
aacaatcagt 
cgtgtataag 
aaagaagaca 
agcaggaaaa 
ataagacgga 
gaccaaaaca 
agtagtaaaa 
atatcaagaa 
agatggaaaa 
aaaggcgaca 
tcgagagtgt
41371
41651
41721 
41791 
41861 
41931 



ggagcaaaag 
tgagtgacac 
tctcaagagt 
tctacatttg 
agetcaaaca 
ggaogtaagc 
aagaagaagc 
tgaaaataca 
agctaggcaa 
cgaccttgaa 
tgtgtttagc 
tcattgaccc 
atctteccaa 
aateagaaaa 
ccgaacatgt 
ctgctagtcg 
acaaacagtt 
aaagatgttg 
gttcaatgtt 
tcctgcatat 
ttaaaacgaa 
attacttcga 
atttttcttt 
aetgacacat 
gcgacccaga 
aaccttcatc 
ggtggattgg 
tgcttgtgaa 
tacrccggrc 
aagcaatcaa 
catggtctga 
attgogtaat 
gggaaaatgc 
gttatatcga 



gcctcggaag 
gttagaaata 



tagcggagtg 
tttttcatag 



ctgataaaca 
agttcaaggg 
acacacacac 
ga&agagaag 
agggagtgtg 
gtatgcaact 
aaoatcagag 
aatageactg 
tgrtaagtag 
gcagttagta 
agatttaaaa 
tagacaaagt 
tttatctaca 
agaaccaaag 
gcggatctgt 
aacacttgat 
tgtttatatt 
atgtaaacgc 
tgatacaacc 
acctatgaag 
cagaatggcc 
agaaaagccg 
cctgatttat 
cgtcatacga 
cagtcttgat 
aatcgactac 
ttaacgacaa 
gacgttagaa 
gacccaaatt 
cacggaataa 
aaagatttat 
tatcatccag
tcaaatactt 
cgctcgagca 
aaacctgcct 
i aagcatatga < 
agttgaagct 
gaagattttc 



agctatgagg 
aactaaagcc 
tacgaggcac 
ggaaatgact 
gagcaagtat 
ccgaagtctc 
gaggtgccgt 
catttctcat 
aaaaataatg 
gctaaagaga 
cacteaaata 
cactggacta 
aagacccaac 
cggtggttat 
accgattatg 
caacacacaa 
agatgagtat 
tatcaaccac 
atttaccttt 
aacgtcttca 
ggaattgttg 
aogaaaaaca 
aaataacaag 
tcagtacgca 
ctagttataa 
aatgtctgat 
attacttcga 




taaagacata 
ggtttggtgt 
cccatatgaa 
cacatcttaa 
aacgacacaa 
aagagcaatc 
aagttaaaag 



atattatgat
atggtaatag 
aaatatgtgg 
aagatgaaaa 
gtgttgaagg 
cgcgttggct 
gaaacaatta 
atcacaatat 
tgagtacaat 
ttaaaagaag 
ctgctcaaga 
gcatagagag 
gaagctattg 
ataggttaat 
gttagaccca 
agggaagaga 
gtgcattccg 
ttctactaac 
tttgcccatt 
tacacttata 
agcaatgcag 
gcaatgeagg 
aaggtacttt 
aatageatte 
atacgtcaat 
attcaggcaa 
tctaaataaa 
gacaecgaag 
gatgtaaatt 
aataggtggt 
ggcgtttggt 
tcatatctaa 
tattttcatt 



gttccttaac 
acaggtatta 
gaagacgatt 
ggtctgtgcg 
attagaaaga 
aaactcgatt 
ataagaaata 
aacacggtgg 
gatacaacac 
ctataaagtt 
aagtttggca 
gaatcgacat 
cacctggaac 
gaagcaagaa 
acagtgtaeg 
attgatttca 
tgtttttaga 
cactaaaaaa 
tggaaccatt 
acacagaggc 
atggccattt 
tctattaaaa 
tacaggaett 
tcgagaaagg 
tctgaagaaa 
tgcctgacag 
agaaaaaaac 
caaaaactac 
agaagttaga 
caaacatgat 
gaacgttgga 



tgcaagatta 



cctgaactag 
ttgatgaata 
ttatcaaggt 
cttgaagtgt 
tttctaacgt 
tggaaaagat 
ttgaataaat 
tgtaaaaagc 
ctctttccct 
taaaaagtag 
agaattgatg 
cagaaaaata 
aagaggtgta 
aagggaaagt 
tgttatggaa 
aaccaoatag 
tatggggcta 
atgttagtca 
taaatcatct 
tgatatctat 
gacatggttg 
agaaattacc 
atgggctcaa 
tactttaaac 
agatatatga 
agttgatact 
tatattttag 
ttcaactatc 
taagttagag 
aaagaaagaa 
atagtggaga 
cgggcacatt 
catagacaag 



tcttaacacg 
gatgttgata 
ttgttgaatg 
cttaagacaa 
tatggtgtac 
atacatcttt 
aaccgtaggt 
cctcgctgta 
tgttagggag 
caacaccgga 
tttagtgaaa 
ccagatagaa 
tacatccttt 
taaagaacaa 
ctatcaaaag 
gggaaaacag 
tagcacctaa 
gaaagtgtct 
gtaaccaata 
taattgatga 
actcattaat 
gtttatttga 
caacacatca 
acgaatagaa 
aaacaaacag 
aatcggaaga 
taacggtgca 
gaaattatag 
tactccaaag 
cattaagctg 
attgtttggt 
gaoaaaatca 



gcgttcaacc 
ccaatcccaa 
tgttaccaao 
ttttggtaaa 
ttagacaatt 
agatagcgta 
agacgttgta 
gttgtatcat 
aggttcaacc 
taaaggggta 
acacaaatat 
gagataacaa 
ttattattat 
acaaaa&tat 
gtaaataett 
tatgcaatag 
tatcaaoact 
acaagttgct 
ttagtcttag 
aagaaaatac 
actgtctaca 
agatttatag 
tagacagagg 
agttagcgaa 
gatatatgtc 
cagtcttatc 
agaaggaaca 
gtttatacag 
aggagtctca 
gtttaaggaa 
cttatagcac 
ttggacttac 
tacgactatt 



tcagatcgag 
ttatctattt 
atgttgtcga 
tagcattttt 
tacctagtca 
aaaggttcac 
aaatgcagtt 
caagattatt 
gacggctaat 
ctattatcat 
atcgtcattt 
ccatagaagc 
tatagaagat 
tctctgttct 
aatatgatcg 
ggcttggtct 
aaaatgtcta 
gcaaacgacg 
tatgactgac 
ataagtccaa 
ggcgtaaagt 
gcactggcct 
gataaaaxat 
gtaagactaa 
cagagcctac 
cgttatacct 
ctcateataa 
tggcgctcaa 
eaaagagcgc 
atctcgatga 
caaagctagt 
aatgaattta 
ggcaagacgg 
aacaaaagat 
atatcgtggg 
tgaatagaac 
tgactatatg 
gcatggtttt 
taacggaaat 
acaagttgac 
ggtacaacta 
atcoagagag 
taaataccat 



ggtctgtata 
tgtaagagat 
tcaaatgttg 
ttttggtgat 
ctgtttgctt 

taggggtaac 

aaattttgta 
agttaaatgg 
gccagaagga 
gtgcatcggc 
ttataagaat 
ataaagtgat 
tacagcattc 
aaagatacat 
gaacacctaa 
taaacggtca 
tttaaaagtc 
gattaacagg 
cgaaagactc 
catgtcctca 
taagcatgaa 
tgaaaaagaa 
gttgtagctc 
atgatgaaga 
aggccaacca 
gcaaccacat 
atccagcaag 
acggccactg 
actcaccaca 



ccaaggaggt 
tgtcgcatag 
ttgctactat 
agtaatgttt 



agttgttgat 
attaaattag 
attcttacta 
gatgatgcag 
ttttgaatta 
caattgtttt 
attggagtat 
tatgaaaatg 
atttaaaata 
tgatgtttca 
gactttatgc 
agtctgaaca 
tgccggtcaa 
atottatcta 
gactgcgttt 
cgcagatatc 
tcaactagta 
taaacgaata 
aagactagca 
acgatagaag 
accatgaagg 
tacggatccc 
gatgaagacg 
aaaatgatga 
aatagtaaat 
atcccaaata 
caaaacaaat 
tgatgatagc 
gccattataa 
atggacataa 
aactaccaaa 
ottacacttc 
ctgacagttt 
ggcagaactt 
cggtttcgtg
ataaagttga 
agttgaagtg 
tatgaacaag 
atactgagga 
agacttaact 
gattacattg 
atagtagagg 
tgaaggcaac 
gaaagtttag 
catcatttct 
gcaacattga 
ccaatgttgc 
cctctaacag 
tacaaggtga 
attatgttta 
aaaacatatt 
aatttgaaaa 
ggtaggtgga 
tgataatgag 
agtgaattgc 
gggttgatga 
agaaagaaat 
cgtgatcaat 
ctaagagtca 
aacacctagt 
gagtcttcat 
actgggagct 
agcgaaagat 
agaaaagtat 
agaatggggc 
cgtaagactt 
atattatcgt 
tagaggattc 
tgcagggcat 
gaattatacc 
tcatgaccga 



tttggggaag 

gtattatttt 

ctttgtgaca 

tttatgagta 

agattcaaca 

gcggagagta 

ggcagttgtt 

gcgatgtatg 

aggcgccgag 

attatattaa 

atctatattt 

acgaatgctt 

aagttgaacg 

atgaaaggag 

cattcttatg

agaagctagc 

agccgatata 

gtcacgaatc 

tgtctcatga 

agtgattcct 

gttggcatgg 

acgacgcgga 

tgttgattgg 

gataagcaag 

aagctataga 

etcaactgca 

gtaagcggta 

ctaaaacaga 

agttgttaaa 

agcgatgatg 

cagaaaccat 

tgaatgctta 

agtttaagaa 

gcgcagcaat 

acgtcttgaa 

aaggcattga 

atggtcctca 

agtctctgtt 

gcagctacaa 

ttgcttatgg 

tttcttaaga 

aactggtcta 

gagaagagtt 

atctccatat 

atcttegaaa 

aaagagacaa 

occcatggaa 

aaaagtggga 

aggacttaat 

tgagcgatgc 

tgcaacaaat 

aecaaattca 

ceatttttaa 

ataaatgaaa 

aaatgggtcg 

ttgtagaaae 

cagagatcot 

acatctggcg 

aaacacggtt 

agttgttaga 

agttgataag 

gatgcattaa 

ataaaaaaga 

aaggtctaaa 

ccaaatagte 

tcagtcgtta 

aagagaogga 

tatctggata 

atgaagaatt 

atcattaagt

atacatgata

ettataactt 

aaactataaa 

ggattaaacc 



taacacaata 
42001 gatcaaagag tatataaagc cttacaaaac

42071 caagaatagc taagcataag caatggaggc

42141 atttaaatat actgaaccag aaacaeataa 

42211 gagatactta acccaacgaa agaactagac 

42281 ttagaacaac tgagttaatg gcgacaaggt

42351 cgaagcagcc gaaagcgagc acttaaagtt 

42421 aataaagoca agaagctaaa gatagaacaa 

42491 caatacgaaa gaactttgct aaagcgatag 

42561 gcaaaaggcc cacaaatccg tagtaatatg

42631 cgacataaat acatgaggca catcgctaag 

42701 cgaccaagca taacaacatt tataagcacg 

42771 agcacggaag aagttaagag agatagcacc 

42841 gatatcatao cagatgcaaa gattgtgcac 

42911 acttagataa cctaatgtca gtttgttata 

42981 caatcccaag aaaattagag ttctaaaaat

43051 cgcccatcgg cttaaaacgt tttttegccg 




aaagaactaa cgcaagaaga actgatgaaa gctattaaag 
ataagatggg aaaggcgcca tatgacatta agccaggaac 
cccaaatgag aacaagaaag agacaaatag actgagaatg 
accaacattg tgtatggaec gctacaaaaa ggagagccag 
tatcgactaa taagatgtta cgtaacctag aagagatggt 
acctgaagat cacaagaaag taataaggtc aaagtattgg 
ataggggatg cttgtcacat gcatcgcaac acagccacta 
cgcaccatgc aggcaccaaa caacactgcg caaagattgt 
acagcatcgg aaagacgtat aaagtcacct gaaagttata 
cggcgtgtct ttzgctatgc aaccaaagag gtgtaagaga 
gtcgcaagcc acaccaatac gaccggccct atcattcaaa 
agatagagat aaccatcttt gccaaacgcg cttacgcgaa 
cacattattc atgttgatga agotttcaac aaagccttag 
gctgccataa caaaactcat gcaaacgaca acgacaaaag 
ttaaacaaaa aaattattca aacaaaactt tatgcccccc 
ggcaccggag aggcc 
Table 8



Bacteriophage 3A ORFs list 



SZO 


um 


PRA 


PCS 




R83 iKpllflCt 






100379 




1 


8515 . . 13488 


1657 


BCaggt acgg&tttBAgBBBaCttt 


tt 9 




100380 


3AORF002


2 


37667 . .40114 


815 


1 1 taaaat aatgaaagtjagccgaac 






100381 


3AORF003 




32188 . . 34149 


653 


t taaagaaattgaggrgtcaagaat 


ttg 


-?.*2 . . - 


100382 


3AORF004 


3 


17457 . .19370


637 


get attttattaaaaaoraaaaatac 






100383 


3AORF005 




334. .2034 


566 


b g +tift aaag at age t ca agaag Bag 






100384 






15571 . . 17154


527


ctttfcatt ticiaaf jioofaif rr a 






100385 




2 


19337. .20836 


499 


a t gat agt a aaac aagt t cagggc c 




_£22 


100386 


3AORF008




22176 . . 23630 




a« t «« 1 1 aggtgc cga cca 


_£*2 


1 tga 


100387 


Jnvnf UUr 




40726 . .42093 


455


gcaaAtacttttataagaBtggtBg 


gtg 




100388 


JAUAE Ul u 




13491 . . 14738 


415 


y .ay s*^ 41 *' * t aco^ui via 




taa 


100389 


3AORF011 


2 


2039 . .3277 


412 


at t aaagacat aatgcgtt aaggag 


_2£3 




100390 


JAUAf tfl* 


2 


4001 . . 5209 


402 


aaafl>ftgagawwa<intT*>a>cyjcgo 







100391 


JAUXCU1.J 


_i 








at a_ 


taa 


100393 







14738 . . 15562 


274 




atq 


tB ? 


100393 




" — 


3249 . . 4034


~Ht — 


cttgaa tt aagaaaatctt toaaag 


gtg 


tag 


100394 


itmirni c 

JnUJCVUi O 


* — 


2*jtjfi7 262*}!) 

<330 ' • >*PftfJ 




aagsagctaagaaaaaoataaaaat 




tga 


100395 


JnUXrvl ' 




6729 . . 7370 


213


t t a a t t t t t aaggag/gaaa t Bag c & 


atg 


taa 


100396 










aat aaaat aaaaag^taggtgataag 


atg 


taa 


100397 




~2 


]U(c 12128 


187 


t*Y At" A A A Ajirt* n a A A Afnt n*- 


~2£2 


_Ef£ 


100396 




— 






gc a gt aggaa 1 1 at ga eggg tcaag 


ttg 




100399 






24(111 2a*>l(; 


174 


9 1 at b e at b Bui v v i» ai w bb BgaaAgga a 


atg 


tga 




3AORF022




12193 1291R 




t • a tag & s c c«y & *y a c ulq tagg t 




tga 


KEEEm 


JMUKTV«J 


— 






aaaataaat oaaatj^jagaataatt t 


atg 


-122 


100402 


3AORF024 




215711 72174 




actaaataaaaataaggaggacact 


atg 


tga 


100403 




— 


i line ioci i 


— — 


t aagca t aagt aa tgg aggt at aag 


atg 




100404 


3AORF026 


-| 


icier itcn 





aagcaac t aact 1 1 at 1 1 1 aaggag 


ata 


taa 


100405 


JAUttr U * r 




Cfl OA £^QQ 

svoa . . oiyu 




a t a t tggc t a t aat a c agt ggt t t t 




taa


100406 


3AORF028 




27845. .28255 


-—2 


cctttt aagatgt -tatgatccttt 


ctg 


taa 


100407 


3AORF029 




34344 . . 34748




ttaaggttttagatttagaggtgga 


atg 


taa 


100408 


3AORF030








tataaaaaaggagttggccagataa 


atg 


_t*a — 


100409  3AORF031   1  20833..21225  130  ttaacaaaattataggagtgagaaa  ata  taa
100410  3AORF032  -2  39984..40361  125  aaatagctgttagagggttacccct  ata  tag
100411  3AORF033   1  7957..8325    122  gaatatctgcgtcttttttatttga  ata  taa
100412  3AORF034  -2  28506..28871  121  gttatcaacctaaggaggtgataac  atg  tag
100413  3AORF035  -2  10671..11036  121  tcctagcttcctaacagcaccgcca  ata  tga
100414  3AORF036   2  30020..30382  120  accaattttaaggaggagttaatca  atg  tga
100415  3AORF037   2  21818..22165  115  aagcgtaagtaatagttaagagtca  gtg  tag
100416  3AORF038  -2  42003..42347  114  gtactcactttcaactgcttcaacc  atc  tga
100417  3AORF039   2  21386..21727  113  tccagaaaatctagagtcataggtt  ata  taa
100418  3AORF040  -3  29654..29995  113  ttgattaactcctccttaaaattgg  "a   taa
100419  3AORF041  -1  4333..4671    112  tactaaatetacatctgatccatga  att  tga


100420  3AORF042   3  5568..5900    110  taaaaaagtggtaggtgatttttaa  atg  tga
100421  3AORF043   1  25690..26019  109  tacoaaattaatatagtcttcgcat  ata  tag
100422  3AORF044   3  29676..30005  109  gtcttaaataattatataaggagte  att  taa
100423  3AORF045   3  30..353       107  cgctagcaacgcggataaatttttc  atg  taa
100424  3AORF046   3  27894..28214  106  aagatattgaaaagctaatttcccc  ata  tga
100425  3AORF047  -2  11907..12227  106  ttcgccgccaaaatgattagcattt  ctg  tga
100426  3AORF048  -3  40343..40663  106  ccataacacatacactgtatgatct  ctg  taa
100427  3AORF049  -3  6749..7069    106  tgttaaaccatcttcagattctcca  ata  taa
100428  3AORF050   1  42700..43014  104  ttatgeaatcaaagaggtgtaagag  atg  taa
100429  3AORF051  -2  13077..13388  103  ttgtacgtaatcccacacatcgccg  ott  tga
100430  3AORF052  -3  3722..4024    100  gcatttcatttcctcctaataactc  att  tga


100431  3AORF053   3  17145..17444   99  tcgagacaatggatatagggagtgt  att  tag
100432  3AORF054  -1  19915..20211   98  ataatttategcttgegaaacataa  ata  tga
100433  3AORF055  -1  42436..42729   97  aatcgtattgatatgacttacgacc  atg  tag
100434  3AORF056   3  40455..40745   96  taaattttgtatacaaggtgaataa       tga
100435  3AORF057  -1  38665..38952   95  atcatcacogtcttgccattgacgc  att  taa
100436  3AORF058  -1  21265..21549   94  gaaatttctatctaacttgtcataa  att
100437  3AORF059  -2  10278..10562   94  tttagccgcgettccaactgcacgt  att  tag
100438  3AORF060   1  5278..5556     92  atatcagecgaataggggtgatgaa  atg  tag
100439  3AORF061   1  35668..35946   92  tttggaaagaaggagagttgaccaa  ata  taa
100440  3AORF062   2  35912..36187   91  gttaaatttggaatggaattaaaca  ata  taa
100441  3AORF063   3  36720..36995   91  cggaagtagcggagtgtaaagacat  act  tga
100442  3AORF064  -2  35694..35969   91  ccgttatacgcgctagcactaataa  ccg  taa
100443  3AORF065  -2  32697..32972   91  aaccgttttcctttgtaaattaggt  ata  taa
100444  3AORF066   3  29157..29429   90  caaactttaacatttatctaaagga  gtg  tag
100445  3AORF067  -2  26661..26930   89  atacttttttagcggaatcggatga  ttg  taa
100446  3AORF068  -2  9624..9893     89  ttttaatpcacctcccatgtattga  ata  tga
100447  3AORF069  -3  13847..14110   87  tgcatttcctcctgattcgtgttga  ate  tga
100448  3AORF070   1  34993..35250   85  cttacgcccaaagogcttttgacct  gtg  taa
100449  3AORF071   2  34745..35002   85  aaatgttcaagaaatggagtgaagc  ata  tga
100450  3AORF072  -1  27379..27636   85  tctgtcgttcctcctctaagttgcc  ttg  taa
100451  3AORF073   2  37367..37615   82  tggtaatagctattatcatttttga  att  taa
100452  3AORF074  -2  23466..23714   82  cgtttgtttttttaaaattcaatat  ate  taa




3AORP075 


-3 


2471. .2719 


82 


agtactgtttg&aatcttctaacac 


ttq 


tqa 


100454 


3AORF076


\ 


26047 . . 26292 


81 


aagtacgttttcttggcggggaggt 


gtg 


tag 




3AORF077 


2 


28292 . .28537


81


aacatcttaaaag^gaggaataacaa 


acq 


taq 




3AORF078


. 1 


5836 . .6075 


79 


ttttgtataaggcttagatttagtc 


att 


taa 






. 2 


5460 . .5699 


79 


actcagtcgcctttaaaatttctct 


ate 


taa 


100458 


3AORF080


- 2 


31350. .31586


78 


ectgtaatcactttagttttattta 


ata 


taa 


1004 59 — 


JHUKC VOX 





8252 . . 6486 


78 


aagt tec ct t aaa t ccgt acctgt a 




tQa 








J3Jv3 . • Jo A JO 




atac tt atagacaacttqacccgcc 








3AORF083


-\ 


34039 . .34272 


77 


ataqttcacctggattattaaataa 


~aTa^ 


tqa 




3AORF084


_ 1 


12007 . . 12240 


77 


acatttttttcattccgccgccaaa 


atg 


taa 


"100463 


3AORF085


- 2 


32367 . . 32597 


76 


cttacaaggtatagagaaataacga 


att 


taa 


- — 


3AORF086 




30618 . . 30846 


76 


a t a t aa t ct aagt tgagga t t a t c t 


ata 


taa 


g — 


3AORF087 




24746 . . 24973 


75 


ataggtcttaagttcaccctcttca 


ata 


tqa 


- — 
1 00400 


3AORF088




12980 . .13204 


74 


tcttt ctttttcgtaccaccatgga 




taa 


-tsStts 


3AORF089 





4290. .4508 


72 


acaggagaagcttatcaatctttaa 




taa 


1 004 o 8 




— 


26926 . .29141 


71 


1 1 at a cacgaa aggag cat aaac aa 


ata 











USA"! 13H02 




ctegt ettget aactgee tagat aa 


atg 


tag 


"lb0470 " 


3AORF092 


~ 


26471 . . 26683 


70 


aaacga aacaa aagg agggggt tea 


atg 




— - - 


3AORF093 


TT — 


2524 . . 2736 


70 


tccaccgt tttctt catagtactgt 




taa 


"*1 004 72 — 


3AORF094 


■"T3 — 


25334 . .25546 


70 


t ggeget ttaatataaa agacgt ct 




taa 


100473 







8316 . . 8525 


69 


aag aga t gggaaaga c agaagaaca 


"ate 


taa 


"100474 — 


3AORF096 


2 


36992 . .37198 


68 


aacaag 1 1 caaggqagc t a tgagg a 


ata 


taa 


- 






32593 . . 32799 


68 


aaagc 1 1 aat acct ctgt cgt c tat 






100476 


3AORF098 




15346 . . 15552 


68 


aatccattaaatcacctacctataa 


ata 


cag . 


100477 


3AORF099 


1 


7225. .7428 


67 


actggtgactggatgaacagaaaag 


tta 

_ a 


.. t ?9 






" 


22620 . . 22823 


67 


cgacttcat gaccggcatgtcttaa 


ata 




"' 1004 7 9~ 




* 


40060 . . 40260 


66 


aa cc 1 1 a cagegaga agggaaagag 


gtg 


"Taa 




3AORF102 




1^035 35*23*9 


66 


1 1 ct at c t cct caaaac aaagt t ag 


tta 


~taa 


















10 ^ 4 " 










aaacaac t t aaaggaggaacgacaa 


atg 


tga 


4B3_ 


i&m>m ft c 


2 


94 20 • . 9617 


~65 


gec t aagt caaccgcct gat tagac 


atg 


tga 









*}*4344 3141ft 


— 


caccagtaat tct tgaattagt tga 






ioc4b's"~ 


3AORF107 




11966 . . 12157 


63 


t ct aaa a aaga t get gtagtagacg 


tta 


"taa 


" — 
100486 




— 5— — 
— ^— 


35054 . .35245 




t tt t ca t cat t tct acct cct caaa 




tag 




■jtnon no 


* 


16010 . . 16201 ' 


63 


gttctt aattccaatqtactgacag 


tta 




100488 


3AORF110 




6184 . . 6372 


62 


atttt cagtgactttat aatagt at 


att 


"taa 


100489 


Jf\kJJ\Z ill 


— 


16500. .16686 




gtagtcaacaattgctttgtattga 




..3. 




3AORF112 


~"^2 


8502 . . 8690 


62 


cttaattctcgcctgatacttttcc 


att 


taa 


100491 


3AORF113 


1 


34162. .34347 


61 


t at gaaggat t aggagt gt gat c gc 


acg 


tga 


100492 


3AORF114 


2 


12356. .12541 


61 


ggatatcacactaaggctatagcta 


ata 


taa 


100493 


3AORF115 


-2 


7635. .7820 


61 


tgaagt t ccct cagct a caccgt ga 


att 


^ 


100494 


3AORF116 


-1 


26434. .26613 


59 


1 1 1 age 1 1 ctgaagt tgt aaaat ct 


ctg 


tga 


100495 


3AORF117 


-3 


17604. .17983 


59 


atagccattatttctagcttgtgtc 


atg 


tga 


100496 


3AORF118 


2 


27699. .26075 


58 


attgaaaagctaattcccccataag 


att 


taa 


100497 


3AORF119 


-1 


39268. .39444 


56 


aegaaaceggt caact t g t t t agat 


atg 


tga 


100498 


3AORF120 


-2 


37152. .37328 


58 


tagctattaccatgaaacttcagct 


ctg 


taa 


100499 


3AORF121 


-2 


18900. .19076 


58 


aaggt act ct ct cccat 1 1 accact 


att 


taa 


100500 


3AORF122 


-1 


21550.. 21723 


57 


taagcatggeaatcacctccttcaa 


atg 


taa 


100501 


3AORF123 


-3 


33062. .33235 


57 


aaa eg t tgt t ct t t aat aagat c t c 


ttg 




100502 


3AORF124 


2 


21212. .21382 


56 


aaat t agaagaggt taaaggagaga 


ctg 


tag 


100503 


3AORF125 


-1 


22051. .22221 


56 


aaatcaggattgaactgcttcccta 


atg 


tga 


100504 


3AORF126 


-2 


7821. .7991 


56 


tgtttttcctgttttacggtcttta 


att 


tga 


100505 


3AORF127 


-3 


34712. .34682 


56 


ttgcattacctattgcgaatgctag 


ttg 


taa 


100506 


3AORF128 


-3 


24056. .24226 


56 


tttttaaaatcaaagcgtctttgct - 


ata 


*taa 


100507 


3AORF129 


-3 


4940. .5110 


56 


cataccatgcagttaatacaaacaa 


ata 


tga 


100508 


3AORF130 


3 


27171. .27338 


55 


cagaat t aa c t at cgatgat 1 1 cga 


atg 


taa 


100509 


3AORF131 


-1 


40367. .40554 


55 


ccttctggcataataataattctat 


eta 


taa 


100510 


3AORF132 


-2 


1860. .2027 


55 


gcgacaacactcacctccttaacgc 


att 


tga 


100511 


3AORF133 


-3 


42317. .42484 


55 


acaaagt t ct t t eg tat tg tag taa 


etc, 


tag 


100512 


3AORF134 


2 


12671. .12835 


54 


t catacaaa t c t t t aaaaggt tgga 


etc 


tag 
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100513 


3AORF135 


-1 


39484 . .39648 


54 


ataatagt atttagcctctgcccag 


ACC 


taa 


100514 


3AORF136 


1 


29710. .29871 


53 


aCCCCACAACAAAaAAtACtAtCaC 


Att 


taa 


100515 


3AORF137 


1 


37166. .37347 


53 


ggeagt tgt t tgaaaa t At OAagga 




taa 




3AORF138 


2 


20996. .21157 


53 


AAtocraaaaataatttttaacaaaa 


ACt 


taa 




3AORF139 


-| 


15114 . . 15275 


53 


t CABCtgBBBttg BAQt AAgt t t A A 


-*£2 


_£22 


100518 




— 


29442 . .29603 


53 


AAAAtggt Att AggAggAttAtCAA 








3AORF141 




40044 


53 






"taa 




3AORF142 




20416 . . 20577 


53 


ACC A C CC 99 AAA Agt CCC AC A AA AA 




_tga 




3AORF143 




1942 . . 2103 


53 








100522 


3AORF144 






53 


CtCCA CC&gtttCAt CtCt CAAQAA 






100523 


3AORF145 








t Ct gag 1 ggt C AgAA C t Ag CC ACC A 


-212 




100524 


3AORF146 




ii cn «i c 

J JO 0 < . 4319 




AA CA C gt CC At A C C AC 9 A AC A AC C A 


att 


tga 


100525 


3AORF147 


-3 


Sells • ■ a 




9C <JA t C C gt t t gtggt A9AC At C C A 


Att 


*9* , , 


100526 


3AORF148 


_i 


J4141>> . J*JUU 


-|i 
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taattcagtcttaggagtatcattt 


att 


tag 


100693 


3AORF315 


-2 


13990. .14097 


35 


acat a t ct cegtat ca 1 1 tgggtaa 


att 


tfig 


100694 


3AORF316 


-2 


2662. .2969 


35 


aattcttcttcatactgtttgacga 


ttg 


tag 


100695 


3AORF317 


-3 


40217. .40324 


35 


t ccct aacact act 1 1 1 taaact 1 1 


ata 


tga 


100696 


3AORF318 


-3 


37535. .37642 


35 


tgt tcggctcctt teat tat tt taa 


ata 


P" 


100697 


3AORF319 


-3 


34421. .34528 


35 


ttcttcatcttttatttgactctgc 


ata 


tga 


100698 


3AORF320 


-3 


28262 . .28369 


35 


catctgttggcaatatctcagttcg 


atg 


tga 


100699 


3AORF321 


1 


23989. .24093 


34 


taaaaaggt t t aat ataaaaa t gt a 


ata 


tga 


100700 


3AORF322 


1 


34660. .34764 


34 


aagagaagattgagaccatggcttt 


atg 


taa 


100701 


3AORF323 


3 


30105. .30209 


34 


ctaaacactgaactatcaactgtag 


att 


taa 


100702 


3AORF324 


3 


30258. .30362 


34 


ggaaaagagt t ccttaaaaaagcag 


ata 


tga 


100703 


3AORF325 


3 


40236. .40340 


34 


gttgtatcatttctggtgatgcaac 


act 


tag 


100704 


3AORF326 




36964. .37068 


34 


cgcaccaacaactgtaaacctttga 


ttg 


tga 


100705 


3AORF327 


-1 


35242. .35346 


34 


aetettgecegttgtataacatttt 


ctg 


taa 


100706 


3AORF328 


-1 






ccac 1 1 acct t c tt gagatgt t gga 


ttg 


tga 


100707 


3AORF329 


-1 


18630. .16924 


34 


ggtggcttaacttccaa^BBceaac 


eta 


taa 


100708 


3AORF330 


-1 


15631.. 15735 


34 


t catgaagcec tcacaaat tagtaa 


ate 


tag 


100709 


3AORF331 


-2 


37998. .38102 


34 


t tacgcccaatagcttcatact cat 


ctg 


tag 


100710 


3AORF332 


-2 


7359. .7463 


34 


tttataaacctttaaagttttagtc 


ata 


taa 


100711 


3AORF333 


-3 


24584. .24668 


34 


aaaaattataaaactataaaaccat 


ate 


taa 


100712 


3AORF334 


-3 


24269. .24373 


34 


tatctttaggcagataacctattaa 


ate 


tga 


100713 


3AORF335 


-3 


14273. .14377 


34 


cacttcagcaagttgatgctttgta 


ate 


tga 


100714 


3AORF336 


2 


7559.. 7660 


33 


gtaactttatctaatttagaagegg^ 


ata 


"9 


100715 


3AORF337 


2 


13277. .13378 


33 


aatacaggtaaaaaagcaggagaat 


"9 


tag 


100716 


3AORF338 


3 


9501.. 9602 


33 


taggaegtacgatgacgatgggcgt 


ate 


taa 


100717 


3AORF339 


3 


27348. .27449 


33 


at a t ct aa 1 1 aaat aagegcact ta 


att 


tga 


100718 


3AORF340 


-1 


37372. .37473 


33 


ttctatggttttcatcttatgagaa 


atg 


taa 


100719 


3AORF341 


-1 


33421. .33522 


33 


aagct aat t eggaeact 1 1 1 cct tt 


"9 


taa 


100720 


3AORF342 


-1 


29047. .29148 


33 


tttggcatctctatcactcctttag 


ata 


taa 


100721 


3AORF343 


-1 


7549. .7650 


33 


a tgat aegee tgaga ct agaa 1 1 gg 


att 


taa . 


100722 


3AORF344 


-1 


7397. .7398 


33 


ctgetgaaact gt tgcaga 1 1 1 1 ga 


att — :■ 


'tga 


100723 


3AORF345 


-2 


23850. .23951 


33 


t t aaac ct t t t t aact t t t aataaa 


art 


taa 


100724 


3AORF346 


-2 


20607. .20708 


33 


aaagatgtacgactagatttagtta 


ate 


taa 




-2 


14175. .14276 


33 


atctgetgteaaagaacgetaataa 


ctg 


taa 




-2 


6984. .7005 


33 


cgt acactggt tgacctgt t aaacc 


ate 


"9 


■r.T.v^TTTTTrrr— 


-2 


6882. .6963 


33 


tagaacgaccaataactgtatttag 


ate 


taa 






-3 


40748. .40849 


33 


aactgcaattcactaaatgctgtaa 


9^ 


tga 
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ISO 



100729 


3AORF351 -3 


38345. .38446 33 


ggttagtagaatgtttttcgtataa 


ate I taa 


100730 


3AORF352 -3 


38081. .38182 33 


tagttgaaggccaatacattaacct 


atg j taa 


100731 


3AORF353 -3 


35432. .35533 


33 


tageaetctcatatgacgcagatct 


•ca t taa 


100732 


3AORF354 -3 


34952. .35053 


33 


ttatcctgatacagatacctctcag 


ate j taa 
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Table 9 

Bacteriophage 96, complete genome sequence 

1 catagttata ggcttttcag ccatataeca agataagatt tatcccgccg tccccataaa aatatgcttg 

71 gaaaccttga tctaatgggg ttttaaccta gcaagtgtca aatatgtgtc aagaaaataa ctttcegaca 

141 cgttgacctt gctctttttc atgttcatca agtaagtgag agtaggtgcc taaagttata gatatattat 

211 aatggcctaa tcttttgcta atatattcaa taggratacc tttagaaagt aggaaagacg tatgcgtgtg 

281 tcttaatgaa caaggtgtta ttgtagtatc attcagtcct atttgactct tagcatgget aaatgacttt 

351 ttaacggcat catgactcaa tttaaacaac ttattatctg tacgttttgg taattttgat aatttagctt 

421 taatatgtcg tatatccttt tctggtacct ccacaagtct gtccgcgtca actgtttttg ttccacgaag 

491 acgtattgra ccctcttttt cgtctagatc gataggcaac atattaacta catcgctgta tcttgcacca 

561 gtgatagcta ggatgaataa aaaaatataa ctcgattcgt ctctagattt aaagtattct atcaattgca 

631 agtattgttc tatggtgacg aatttagagt gttegtcttt cgattttttt gtaccacgaa tatctatttg 

701 atagccaggg tctttcttta aatagccctc atatactgca cctctgaagc attgtgataa acaactgttt 

771 aatccacgaa ccgtttcatc agtacgacct cgaccgaatt cgttcaaaaa cttttgatac tccgaacgtt 

841 cgatgttttt tattaaaaaa tcactcccga aatattcgtt aaataatttt aatgaacgtt gataccaata 

911 gaattgttgt gaagogacat gtttcttatt ccttgaatct aaccaaccat tgtaatatcc ttcaaacttt 

981 ttattttcat ctaaattgtt tccatcatcc aaatctctaa gcagtcgttg agcagcgtcg gttgcctcag 

1051 ctttagtttc gaaecccgac tttcttttct tccctgatcc gaaagacgga tgttttacgt cgtactgcca 

1121 agatgctgtt gctttattct ccctttttgt aactgtaaat gacgccattt tacttttcct cctcaaaatt 

1191 ggeaaaaaac aataagggt* ggo9"9ctae ccgaaatttt attgttgaac aactattgcc tcacctcttg 

1261 cttttcctac ttctttccta aaactarcat atgattgatt agggtgtgtt aacgaeattc ctggaccacc 

1331 tccagcatgt tggtttttgt ccggattact ttccatttct tcagtggctc ttttagcatt taaatattct 

1401 tcgtaactag gttcgtttgg gtcgcgtggc tgtgcccgtt gteeattatt ggtagctgga agactcttct 

1471 gtacctgttg cctagatgtg ttattggttc gttgacegtt gttaatgttt gtgttgttct cgccgtttac 

1541 ttgattattg ttatcgtttt gattaceatt ctcttttttc gcttctgctt tatctetagt ttctttcttt 

1611 ttgtctttgt tctctttctt tgtttcggcc ttcttgcttt cctctttctt atcgccgtcg tcgccaccgc 

1681 atgcacctaa cactaacgca ctagctaata ataaaactaa taatcttttc atgttttaca ctcctttatt 

1751 tgctatttgt tctaataaat ctatgatttc attgttttgt tctatgactt tgttttcatt tttaagatgt 

1821 tcgtctaaea tctctattaa gacgaaattt tgatttatca tttcgtaagt aaacatttga ecegtgttgt 

1891 taggatcaga aaaogaacta ctgaaacgcg ttgaaaagct atctataaat tgaccaactt tattttttaa 

1961 taacatatct ttaccgctct cagacattgt atttagttcg cgcttatcta aagttttttc tataattttg 

2031 tattttgttt cctgacttct ttcgatttct tctacttcaa aagggatact gttattaaac ttttcgataa 

2101 catcacgttt ttcagaaact gacatacgat caaataettg tttttgacct ttatttaact tccctcgaat 

2171 ttttccggca gtccaagact cttcaaetgt taacttatca tcaggaaett gattcatctt tcatacgact 

2241 ccttttctca tatttcttta tatttaaaaa ctctcaacgg ctcaaatgta accgaacact cgccatagtg 

2311 agttccaata ccgtatatct cctcatattg ttcrattgcc tccaatatgt attcctcgct taattgtaga 

2381 tactcagaca actcatacaa gctacgtacg ccataattgt aagcttctac aatttcgcgt aacgggactg 

2451 ctgagataaa gccgtgtcgt cttgcgtaat tttcgaactt gcgattgttg aatttcgatt gatctaaaat 

2521 gttgccatac gtcaactcgt ggtgggcaag ttcttcatat aatacttcta atttgttccc tccggataag 

2591 gaaggtctaa taaaaatttc eccttcttga caccaaccat cgaatcctcg aggtactctt tgtgtttctt 

2661 ccacttcaac ttcacatttc ataagcaatt cttcgtattt tcccacgcgc caaacccctt tggtgtctta 

2731 tctcttccta tctctaaccc atcgcacaaa attctcgatt tcttcccatt cttcgggagt aaactcatct 

2801 ttattcgcat gaccggctat agcttcttga tgaatacttc tttcttccgt aattctcgat tcaggtacat 

2871 taaagtaaec tgctaattgt tggacttttg atattctagg atattcaagt tctttaagcc agttagagat 

2941 tgttgattga cttaccccga ttgcttcaga caattccact tgagtaatgt tgttctcttt cateagttgt 

3011 tctaagttct ctgataaaat ctttctagca ctcctatatt ccacaattct ctcctttagt attacttaat 

3081 gtaacactaa tttaccataa gtaatatcac tttecaatac aaaacactac ctttttgaaa taaatatcac 

3151 ctcaggcgct gacatatcac tttaagtgae agtacagttg taaacgtcaa cgggaggcga tacgaaatgc 

3221 cagaaaatet taaagagtcc tctgtaaagg tctggagaac taactcgaat acgacacaac aagatgtcgc 

3291 tgataaatta ggcgttacta aaeaatctgc aataagatgg gaaaaagatg acgcagaatt aaaaggecca 

3361 caattgtatg ctttagccaa attattcaac acagaagtcg attatataaa ggccaaaaaa atttaacatt 

3431 aatatcactt taagtgataa aggaggaaac tgaaatgcaa gaaecacaaa catttaattt tgaagaatta 

3501 ccagtaagga aaattgaagt ggaaggagaa cccttctttt taggtaagga tgttgctgaa attttagggt 

3571 atgcacgagc agataacgcc atacgcaatc acgttgatag tgaagatagg ctgatgcacc aaattagtgc 

3641 gtcaggccaa aacagaaata tgatcatcac caacgaatct ggactataca gtttaatccc tgacgcttct 

3711 aaacaaagta aaaaegaaaa catcagagaa accgctagga aatccaaacg ctgggtaacc tcggaagttt 

3781 taccgacgtt aagaaaaact ggtgcttacc aagtacctag tgacccaatg caagcattga gattaatgtt 

3851 tgaagctaca gaagaaacaa aacaagaaat taaaaacgcg aaagatgatg ttattgatct gaaagaaaat 

3921 caaaaactgg atgcgggaga ctacaatttc ttaactagaa caatcaatca aagagtagct catatacaaa 

3991 gactacatgc gataacaaac caaaaacaac gtagcgaatt atteagggat attaattcag aagtgaaaaa 

4061 gatgactggt gogagttcaa gaacgaacgt aagacaaaaa catctcgacg atgtaactga aatgatcgct 

4131 aattggttcc cgtcacaegc tactttatac agaaccaagc aaattgaaat gaaattttaa aacgaaatat 

4201 aggagaggct gaatatggaa tacatcggat atgcagacgc aaatgegctt gtaaaaataa gtggcatttc 

4271 aaaagatgat ctagagaaaa aagtctaccc gaacaaagag ttecaaaaag aatgcatgca cagacttggt_ 

4341 cgaggacaaa agcgttacat aaaaattgac aaagctactc aatttatcgg taccaattta acgaccaatg 

4411 aataogaatt ataggaggag ttatcaaatg agtaaaactt ataaaagcta cctagcagca gtaccatgct 

4481 tcacagtctt agcgactgta cttatgccgt ctctatactt cactacagcg tggtcaattg caggatcegc 

4551 aagtatcgca acattcatat actacaaaga atacttttat gaagaataaa aaaactgcta cctgcgtcaa 

4621 caagtaacag tgacaaacat ttaccaaaat atacaactta actaaatcaa aatatacgga ggcagtcaac 

4691 tatggccgaa aatattaaaa ctgaacaaca ttatcacact aaagatttct caggatacag aaatgaagaa 
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4761 gataactttg tagcaaatca agaategaca gtaacaatca cattgaacga gtacagaaaa cctattgaaa 

4831 eaaaggctgt taaagacaaa gaagaagata cctacagagg taagtatetc gcggaagaaa gaaaaaacga 

4901 aaaattggaa aaagaaaata taaaactaaa aaacaaaacc tacgaatcac aaaacgaaga agataacgag 

4971 gaggacgaag aagacaagga ggacgagaac gatgtattac aaaattggtg agacaaaaaa caaaattata 

5041 agctttaacg ggttcgaact taaagtgtct gtgatgaaga gacatgacgg tatcagtato eaaaccaagg 

5111 atatgaataa tgttccactt aaaccgtttc acgtcacaga cccaagcgaa ctatatatcg cgacggacgc 

5181 aatgcgtgac gttataaacg aacggattga aaaLaacaca gatgaacagg acaaactaac taacttagtc 

5251 atgaaatgge aggaggtacg aaaagtgaat gaittacaag agagagaatt agaaacattc gaaeaagacg 

5321 accgattcaa agxaaccgac ccagacagtg ctaactgggc ttttaagaaa ctggatgcaa tcacaactaa 

5391 agagaatgaa aecaacgacc cagcaaataa agaaactgaa cgcacaaacg aacggaaaga caaagaagta 

5461 gaaaaatcac agagtggcaa agaatattta caaagccttg taattgaata tcacagaaca caaaaagaac 

5531 aagacagcaa actcaagttg aacacacctt acggaaaagc gacagccaga aaaggttcaa aagtcattca 

5601 agccagcaat gagcaagaag teattaaaca acccgagcaa cgaggtctcg acaactatgt aaaagtaact 

5671 aaaaaaccta gccaaccaga eattaagaaa gattccaacg caaccgaaaa oggcacaccg attgacgcaa 

5741 acggcgaagt tctagagggt gctagcattg tggagaaacc aacgtcatac eeggtaaagg cgggagaata 

5811 gatgactgaa aaaaccaatc aagatgtcga tatttcaacg caactaggtg caaaagacat cagcaaacaa 

5881 aatgcaaaca agtttcataa atttgegaca tacggcaagc ccggtactgg taaaactacg tttttaacaa 

5951 aagataacaa tacctcagta ctagatataa acgaggacgg aacaacggta acagaagacg gggcagtcgt 

6021 gcagaccaag aatcacaagc attttagtgc agtgateaaa atgccgccca aaaccatcga acaactaaga 

6091 gaaaacggaa aacaaaccga tgccgtagtg atcgaaacaa tccaaaagtc acgtgatatc actatggacg 

6161 acatcatgga cggtaaatca aagaaaccga catctaatga ttggggcgag tgtgctacac gcattgtaag 

6231 cattcaccgt tacacctcta aattacaaga acactaccaa tttcatcttg ctataagogg acacgagggc 

6301 actaacaaag acaaagacga tgagggaagc accaccaatc caacaaccac gatagaggca caagaocaaa 

6371 taaaaaaage agtcatcagt caatctgacg tgttagcaag aatgacaaca gaagaacatg agcaagacgg 

6441 cgaaaaaac't tatcaaeatg tacttaacgc cgaaccacca aattcarccg agacaaagat aagacaccca 

6511 agcaacatca aaaecaacaa caaacgtttc attaacccaa gtattaacga tgttgtacaa gcaattagaa 

6581 acggtaatta aaaattaatt aaaaggacgg tataaaaatt atgaaaacca ctggtagaac acaatacatt 

6651 caagaaacta atcaagaggc atccacgaaa ggcggggacc ttttaggagc tggagaattt acagraaaag 

6721 ttgcaaatgt cgagcccaac gacagagaaa acagatactt cacgattgtc tttgaaaaca acgaaggtaa 

6791 acaatacaaa cacaaccaac ccgtcccacc attccaaeaa gattatcaag aaaaacaata catcgagtta 

6861 cetagcagat taggaattaa attgaactta ccagatttaa cttctgacac agaccaacca attaacaaaa 

6931 ecggaactat tgtacttaaa aataaactta acgaggaaca aggcaagtat ttcgtaagac tctcatatgc 

7001 aaaagtttgg aataaagacg atgaagtagc taataaacca gaacctaaaa ctgatgagat gaaaeaaaaa 

7071 gaacagcaag caaacggtaa acagacaccc atgagtcaac aatcaaaccc attcgctaat gctaatggte 

7141 caatagaaat caatgatgat gatttaccge tctaggacgt ggtttaaatg caatacaera caagatacca 

7211 gaaagacaat gaeggcactt attccgtcgt tgetactggt gttgaacccg aacaaagtca cattgatcta 

7281 ctagaaaacg gacatccgcc aaaagcagaa gtagaggttc cggacaataa aaaactatct atagaacaac 

7351 gcaaaaaaat attcgcaacg tgtagagata tagaacttco ctggggcgaa ccagtagaat caactagaaa 

7421 attattacaa acagaattgg aaattatgaa aggttatgaa gaaatcagtc tgcgtgaccg ctcaatgaaa 

7491 grcgcgagag agttaataga actgattata tcgtctatgt ttcatoatca aatacctatg agtgtagaaa 

7561 egagtaagcc grtaagcgaa gataaagcgc tattatatcg ggctacaatc aaccgcaact gtgcaatatg 

7631 cggaaagcct cacgcagacc tggcacatta cgaageagtc ggcagaggta tgaacagaaa caagatgaat 

7701 cactacgaca aacatgtgtc agcaccgtgc agacaacatc ataatgaaca gcacgcaatt ggtgttaagt 

7771 cgtttgatga taaatatcaa ctgcatgacc cgtggataaa agttgatgag aggctcaata aaatgtcgaa 

7841 aggagagaaa aatgaataag ctactaatag atgactatcc gatacaagta tcaccgaaac tagctgaact 

7911 aatagggtta aacgaagcoa tagtattgca acaaattcac tattggctaa aeaactcaaa acataaatac 

7981 gatggcaaaa ctcggatttt taattcttat ccagaacggc aaaaacaatt tccattttgg agegagagaa 

8051 ctataaaaag gacatccggg agtttagaaa aacaaaatcc attgcatgta ggtaaccaca acaaggccgg 

8121 acttgaccgc acaaaatggt attcaatcaa ztatgaaaca ttaaacaaac tagtggcacg accatcggga 

8191 caaaatggcc cgaegacgag gacaaattgg cacgatgcaa gaggacaaaa tgaeeegaee aataecaeag 

8261 accacacaga gaccaacaaa catagagaga cagacgacgc ctcaaagcca cttaagcaca ttagcaccaa 

8331 cttagaaatc atacaaaacc ctttaaaagc agaacagcca gaacacgaaa ttaaatcatt taagcaagat 

8401 cagtccgaaa tagcaaaagc cgctaccgac tactgcaaag aaaacaacaa aggcctgaat tacetactaa 

8471 ctgtattaaa gaaccggaae aaagaaggcg cttcagataa agaaagtgcc gaaaacaaat cgaaaccccg 

8541 caactctaaa aaagaaaeta ctgatgatgt cacagcacaa atggaaaaag aattgagtga tgactaatgc 

8611 cgaegagcaa aacacaagca ttagaaatca ceaaaaaagc taggtaogta tacaaeatcg attttgataa 

8681 accaaagtta gaaatgtgga ttgatgxatz aagccaaaac ggggatCatc aaccaactgt aaaagctgta 

8751 gatggatata tcaacagtaa caacccgcac ccgcctaacc taccagcaat catgcgtaag gcaoctaaaa 

8821 aagcatctat tgagccggta gacaaogaaa ccgctacaca ccaacggaaa atgcagaatg accccgaaca 

8891 cgtcagacaa agaaaaatag cgctagataa cttcatgaac aagtcggcag aactcggggg cgataacgaa 

8961 tgaattacgg tcaatttgaa attgaaagca caataatcgc tacgctaccc aaacaaccgg acgtactaga 

9031 aaagataaga gttaaagatc acatgtttoc gaacgaaaag tttaaaacct ttttcaatea tgtaacggac 

9101 gtcggaaaga tagatcacca agaaatctat ctaaaagcaa ctaaagataa agagtcttta gacgcagata 

9171 ctataactaa accteacaac tccgattcca ttggatacgg attetttgaa cgccatcaac aagaattatt 

9241 ggaaagttat caaaccaaca aagcgaaaga atcggtaaet gagttcaaac aacaaeetac gaaceaaaat 

9311 cttaataacc tgattgatga actcaaggac tcaaaaacaa ttactaacag aaaagaagac ggaaccaaga 

9381 agtttgttga ggagtttgtc gatgagttat acagcgatag ccctaagaag caaattaaga cgggttataa 

9451 gctcatggac tacaaaatag ggggactgga gccgtcgcaa ctaatogtca tcgcagcgcg tcccteagtg 

9521 ggtaagacag gttttgcatt aaacatgatg ccgaaeatag cacaaaatgg atacaaaaca tctttctcta 

9591 gtcecgaaac aactggcaca ceagtattga aacgtatgtc atcaacaatc actggtattg agtcaacaaa, 

9661 gataaaagaa atcaggaacc caacgccgga cgacttaaca aagttaacga atgcgatgga taaaatcatg 

9731 aaattaggca ccgacatttc cgataaaagc aatatcacac cgcaagatgc gcgagogcaa gcaacgaggc 

9801 attcagacag gcaacaagtc atttttatag attatcttca accgacggac accgacgcga aagtcgacag 

9871 acgtgtagca gtagaaaaga tatcacgtga cctaaagata atcgctaacg agacaggcgc aatcatcgta 

9941 ctactttcac aactgaatcg cggtgtcgag cccagacagg ataaaagacc aatgctatog gaoatgaaag 

10011 aaccaggcgg aatagaagca gatgcgagtt tagcgacgct actttacegt gaCgattatt ataaccgtga 



WO 0002825 



PCT/IB99/02040 






10081 
10151 
10221 
10291 
10361 
10431 
10501 
10571 
10641 
10711 
10781 
10851 
10921 
10991 
11061 
11131 
11201 
11271 
11341 
11411 
11481 
11551 
11621 
11691 
11761 
11831 
11901 
11971 
12041 
12111 
12181 
12251 
12321 
12391 
12461 
12531 
12601 
12671 
12741 
12811 
12881 
12951 
13021 
13091 
13161 
13231 
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14351 
14421 
14491 
14561 
14631 
14701 
14771 
14841 
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149B1 
15051 
15121 
15191 
15261 
15331 



cgaagatgac 
ggaataattg 
cttattgaaa 
ggttgggcgg 
agaaaatcat 
atattactta 
gtttattcgc 
ctaaaggcgc 
gtggcacaat 
gtgcgaatat 
aaatttgaat 



gatattcaga 
gaatggatgg 
caacaacaag 
atgatgtgga 
tgacaacata 
aatttaaggg 
gatcaactat 
agatgcteta 
attaaaaaag 
taccga&cat 
ggtggccggg 
cgttacccaa 
actacgtaag 
gtcacttata 
tatgaacaaa 
tttatcgaac 
egtctagagc 
tgacetgtgg 



tgtgattgaa 
cctaattgtt 
ccctaagaga 
tttcacggac 
catatataaa 
ataatggcaa 
ccgaacaagt 
tttttcaact 



agcatcactg 
aatctgagca 
tcgatgtacg 
tcaatagatt 
gaacgaaacc 
caccgagaag 
tcacaggagc 
tcacgggctt 
gagcaaatac 
caccaatact 
cacaacctaa 
actggttgaa 
catcagtata 
tatatgagga 
catacataaa 
raaagaaaaa 
acaataagac 
caaccatggc 
tagtggatat 
ggtgttaatg 
taaatgtaca 
gaaaatgaat 
cgttgttaga 
aatcatggag 
aagaagccac 
accaaatgtt 
acgcaagaca 
aggtcacggc 
accgttaaag 
gagggttaac 
agaccgtgaa 
aactataaga 



gcaaacctat 
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tgttgaatgt 
actcagaggc 



tatctgtatc 
attacaaaac 
gcaacatgat 



Catatgaaaa 
taaaactatg 
gtatcaagtt 



gt« 

tggagctgaa 
gtttttaaag 
gaaaacgtaa 
ctactgcttc 
ggcaaaagca 
ctctagggca 
gtgcgtacaa 
gtccgacgaa 
gaagcagtag 
gtaggagata 
gagatacaca 
cceaatcaag 
aataaaaatt 
ttgaaatctt 
ttggattaat 
cctgaatccc 
atetaaacga 
t gat ate gat 
taacagtaga 
ctcaaacgac 
aaetggaaga 
gtctgcaatt 
gattgaaaat 
gaacaatttg 
ctcaccatac 
tggaacagca 
gtgacgcaac 
gagacaacca 
agctaagata 
agtagetagc 
ggcgataata 



tgaaagtaaa 
gatgggcatg 
tcactgaaga 
cgattcaatc 
acattaatct 
ctttgcatct 
acgaacatgg 
cggaggcaat 
aatgttgact 
acgaacgcga 
cgaagagctg 
aaaactagag 
ctgcagatat 
aatggaggca 
cggtaatgac 
atacccgagg 
gggttatggc 
aagtatataa 
ctataaactg 
ggggacgagt 
caaagtcggt 
caatgaagaa 
cctaaatctc 
cggaggacga 
cgactggtgg 
agagtgctta 
tcaattaaaa 
acagtagcca 
aacaaccagg 
aactetgact 
gaagttactt 
taaaaggtat 
aatcgaccaa 
gaogcaggaa 
acctagtcac 
gacgttcaca 
aggagagacg 
aaegaaatat 
gagatatcga 



gccggacaac aacgagttat 
aaccggaaga aaacacacac 
atggcacgga agatattaag 
ccatttcage gacgaaaaga 
ctatatgagc aagagctagg 
aacgecaaga aagttgagta 
tatgaatggc 
caaagaccga 
ttaaaggtaa 
cttaacgtgg 
gtcagacgta 
gacacaagaa 
caaagcgctt 
agaggtggaa 
gtaagtgtcc 
aagctgagat 
atctaccaga 
aaatgagcat 
gaacggtcca 
gcaccttatg 
aagagcgtga 
tgtgcctcaa 
agcgaagcat 
aecggegcat 
cctcaaceag 
aggatttagc 
caaaaacaag 
acgaacaage 
ggafcgaagag 
acgatagcgc 
gcgagtggca 
gatgtttaaa ggcgcgaaaa 
ctggaatatg aggaacagaa 
atgacgctac 
cagatagaac 
aaaagtagta 
gatcgtctag 
cagacggtac 
ggagctggta 



attcgggaaa 
gttatagacg 
gagatgrgaa 
cttagtgaaa 
cgcaacaatt 
gacacgccgg 
atgcatatat 
tagagataac 
aaaaacggaa 
taagtgaact 
gaggtggaat 
tgataaaaaa 
agccttacat 
cgagagagca 
atccgtctaa 
caagaaatgg 
atgtcaaaca 
acagtatcca 
aatggtcatg 
gatggeaacg 
aaatacaeag 
caagcacatg 
atgaccatga 



ctttcacacg 
ccgatcgtag 
cgcctttcga 
taaggagtgt 
gccatcaagc 
aaattatgac 
actgcaagca 
caaaggaatc 
actaactatg 
ttacgtatat 
ggcgactgaa 
acacgcaaag 



cacctacaga 
ogatgacaac 
caaacgaagc 
aagatctgca 
ggccaactta 
aatcacaaca 
cgtaaagatt 
actccaggaa 
gtatgcgctt 
aatggttagg 
aaaeaccctc 
aatgagcata 
tacaeatacg 
cattcgcaac 
aaaggcgaag 
ttgattacgt 
taaagctatg 
ctatataacg 



aacataacgc 
caaatgtaca 

tgattaegac 
taaaaaatgc 
ataaagacaa 
tgatagtgac 
acgatatttg 
gtatttgata 
atcgtatcga 
agccgatttc 
gttgccaaca 
cgcctaaata 
agaaacgaag 
agetgaatat 
ccggacgaat 
tgaacgaagt 
caatgaaact 
aatgtattag 
cgcacaaatt 
aacggtaaac 
tggttgcaaa 
agctgagtat 
caacgacgta 
gtgacccgta 
accagcaaca 
geaacattga 
aggtaatgca 
ttttacgtcc 
aatgtcatta 
agtcataaag 
aatgcatgtc 
tacttaaacc 



cggcgaaacc 
aattcaaaag 

catcgagacc 
agagttgaaa 



catatccatg 
agacatttga 



gtgtttcaat 
ggttttatac 
cactgagttc 
agagagttga 



tcgaaccggt 
gggcacgctt 



aaatgtgatt 
tattaaaact 
ataatgatac 
ctateaetga 
tgagtacaag 
gagaacatgt 
aaccacaccc 
tacactgtcg 
gacacaaatg 
acagaggggt 
gaatgacaaa 
aggtcaagtt 
ccacgaaaca 
cccgatgatt 
gagtggaatg 
acacgtaaca 
gtgttattaa 
ataagacgcc 
gtaaatgctt 
tacgagctag 
aacttattga 
gaacttttac 
gtatgattat 
taagccatta 
attgttgatg 
taccraaact 
tgataatagt 
ctcattgacg 
aaggatacgc 
aacattcaaa 
gttgttgagg 
gagatgccaa 
cagaactatt 
laga 



gattcgcaaa 
aaccaacaag 



oagcaacgag 
ggcgcacagt 
cgagagcaaa 
cgaatggaag 
aataaecgtg 
tactaaaaga 
tgctattcaa 
gtctataaat 
aacgaaagtc 
cagttatcag 
gaaacgacta 
aaagataatg 
ctggattttt 
tagatattac 
gaaatcatcg 
attattggtt 
ggaattaaaa 
aaatacaaaa 
tgaattegcg 
gatacacaat 
aagaagattt 
aeacccagtt 
atcgcacaag 
catacaaaaa 
gtaaagacat 
gatccaacag 
cggagagtaa 
agaaaacggt 
atatgaatat 
gacgcattaa 



gtatgatgca 
gaaacattat 
aagctcccgt 
agaagaatta 
gcaataagac 
atgagcttat 
caaaaatgaa 
gcgaacagga 
aactagacgg 
aacaaacaga 
gatcgaggae 
cattcagtta 
acgaggagga 
tagagactgt 
eattacagac 
aaaaatgtgg 
tgaaagaggt 
atcatttcaa 
tttgctaacg 
accaaogtga 
gaacaaacgc 
cgacaggaga 
ggagetcgac 
gagtgggtta 
tagatgagat 
ggaagagact 
catcctgccc 
tcttaacaac 
gaaaatgaaa 
cttagatcga 
gacaaccaca 
agaaggagcg 
aacgattgat 
gacagtgagt 
aacaagccat 



tactgctgaa 
ttactttaga 
ggaatgggca 
ggagaatgta 
acatttttac 
cgacattgaa 
ctttatatac 
aaectataaa 
ttcgtgaaaa 
tcgaaactgc 
attaatttca 
actttagagc 
cgaggataca 
ttgcttggaa 
taggagctct 
aacaaatgag 
acaatcagcg 
gtgtataaga 
aagaaggtat 



cgtaaacgaa 
ctaaaacgat 
atatttagag 
gcaaagtaga 
aatacaaccg 
tctttgtgga 
tcaaagogaa 
cacaggtcaa 
cgacccaatg 
cagcaceaog 
tactaaagta 
atccgcaact 
ggcgtatcaa 
acaaattggc 
agaggattgg 
catataaatt 
aagagtacga 
aaagaaactg 
aagaggccga 
ctggtccgac 
gaaaagtaga 
aattatagat 
atcaaatact 
aaagagcttt 
caggaacaat 
aagttagcaa 
gtttggtctg 
aacttttcag 
cacttattac 
gagctcgaaa 



tggaacaatc 



tgcagaaatc 
ggccttgaaa 
taaacgaaga 
gtatgtcttt 
gtaogaatga 
aagcatagaa 



gcaaccggcc 
tctatcacca 
tgccgcaccc 



tcaatcaaaa 
gctaagttaa 
gatacgaaca 
ctatatagga 
ctctacgaat 
atcagracca 
aagcgcaagc 
tgaacttgat 
gagcatcagc 



aggataccta 

agcacgagac 
tgcattgcta 
aagxcagaaa 
Caacgaacaa 
gacagaacac 
acacactcga 
tgccgactac 
actgaggcca 
atgcaacgea 
gcctttctcg 
aggaaocacg 
gtcaaggagg 
cgaacacttt 
aaagagaagr 
gtagacgaaa 
taatgtcagc 
aeaaattacc 



aaccaattta 

agttatcaaa 

ttaggtttct 

tttacattat 

cgaaagatac 

gaagaaaagt 

atgaacagat 

atgaataatt 

cgaceagaaa 

gtttcttaaa 

ttagccctca 

cggtcgattt 

taeaetaaea 

taogccaata 

aaagacaaga 

ttttggggaa " 

actgctgcta 

acgagaaaca 

acctattagt 

tgatgaagat 

gacaaactaa 
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15431 catgtegagg aggcagacga tgattaacat acctuutg aaattcccga aaaagtacac cgaaataatc 

15471 aagaaacaca aaaataaaac acctgaagaa aaagctaaga eegaagatga cctcactaaa gaaactaatg 

15541 ataaagacag cgaatettac agtcctatga tggctaatat gaacgaacat gaattaaggg ctatgttaag 

15611 aatgatgcct agtttaattg atactggaga tggcaatgat gattaaaaaa cctaaaaata tggattggtc 

15681 cgatatcctt attgctggaa tactgcgatt attcggcgta accgcactga tgcctgttgt catatcgcct 

15751 acctatacag cggetagtta ccaaaaeaaa gaagtatatc aagggacaat cacagacaaa cataacaaga 

15821 gacaagacaa agaagacaag ttccatattg tgttagacaa caageaagtc atcgaaaact ctgacteact 

15891 attcaaaaag aaatetgata gcgcagacat acaagctagg ttaaaagcag gcgacaaagt agaagttaaa 

15961 acgactggtt atagaataca ctttttaaat ttatacccgg tcttatacga agtaaagaag gtagataaat 

16031 aatgattaaa caaatattaa gactattatt cctactageg atgtatgagc caggtaagta tgtaactgag 

16101 aaagtatata ttatgacgac ggctaatgat gacgtagagg cgccgagtga crtcgcaaag ttgagcgatc 

16171 agtctgattt gatgagggcg gaggtgtcag agtagatgta tagcaaagag ccaattgtca atatgatagg 

16241 cacacataaa atgaagtgta acgtattagc tgargtaaca ccggaatatg atagcaattc aattgcacag 

16311 catggcAtac aegcaacgtt gccgaaacca caaggggaaa actcaagtaa agtcgaagac gttgttgtga 

16381 ggctcgagag agcaaataaa aggtatgctc agatgttaaa agaggttgag tttataaatc aatcgcaaca 

16451 gagattggga cacgttgact tctgcttctc agagttattg aagaaaggtt acaacaggga tgcgattatc 

16521 aagaagatgc ctaactctaa actaaataga aacaacttct tagcgcgccg tgatgagtta gcagaaaaga 

16591 tttatctacc acagtgaoga aaatgacaaa aatgacagaa acgacgaaaa tgacactatt tttaaactgt 

16661 gaattaattt tacacaattg atttgtaaga attatcttaa gaegtggggt aatagccaca ttagatgttc 

16731 ccatcgatgt gattgagaag tgacaaacac acaaaagatg atacgttacg ctattaatca cctactacct 

16801 gcctatatgg tgggtagttt aatccttgca ctttgagtca caactatttc cctcctttca catttattga 

16871 acgtagctcc cgcacaagat gtaggggcat tetttaeact caaataacta gagtaattaa cgtaaaggcg 

16941 tgtgatacag cgaaaacaac tgattaaact aacaccgaag caagaaaagt ttgtgccagg actcatagag 

17011 ggcaagagcc aacggaaagc atatattgac gcagggtate cgactaaagg taagagtggg gaatatceag 

17081 acaaagaagc gagtacactt ettaaaaacc ggaaggttcc cggaaggtac gaaaaaetgc gtcaagaagt 

17151 agctgaacaa ccaaaatgga cacgccaaaa ggcctttgaa gaacatgagt ggctaaagaa tgtagceaag 

17221 aatgacattg aaacagaggg agtgaagaaa gcgacagccg atgcatecct cgctagttta gatggtacga 

17291 acagaatgac grtaggtaac gaagttttag ctaaaaagaa aacagaaact gaaattaaga tgcttgagaa 

17361 gaagaccgaa caaacagata aaggtgacag tggaacagaa gataaaatca aacaacttca cgacgcaata 

17431 acggaagega tcgtcaatga acaaacccaa atctttatat acggacaaac aaattgaaat accgaagcaa 

17501 acgcaaaaac aagactggct tatgttaatt aatcacggag caaagcgtac aggtaaaaca atattaaaca 

17571 acgacttatt tttacgtgag ttaatgcgtg egcgaaagat agcagacgaa gaaggaattg agacacctca 

17641 aeatatactt gctggtgcaa cattaggtac gattcaaaaa aacgtactaa tagagtcaac taaeaaatat 

17711 ggcattgagt ttaattttga taaatacaat ccattcatgt tatttggcgt tcaagtggtt cagacaggtc 

17781 acagtaaagt aagcggtata ggagccacac gtggtatgac accgtttggt gcataratca acgaagcgtc 

17851 gttagcgcat gaagaggtgt ttgacgagat caagtcacgt cgtagtggaa ctggtgcaag aatattggta 

17921 gataccaacc ccgaccatcc cgagcactgg tcgttgaaag attacattga aaatacagat cctaaagcag 

17991 gtatactgag tcaccaattt aagctcgatg acaataactt tcttaatgat agacataaag agtctattaa 

18061 ggcttcaaca ccatcaggta tgttccatga acgtaatatc aacggcatgt gggtgtetgg cgacggtgca 

18131 gtatatgccg actttgattc gaatgagaat acgattaaag cagatgaact ggacgacata cctatcaaag 

18201 aacactttgc tggtgtcgae cggggttacg agcactatgg atceattgtg ttaataggac gaggtataga 

18271 tggtaaccct tattttattg aggagcacgc acaccaatct aagtttattg atgattgggt ggttattgca 

18341 aaagatattg taagtagata tggcaatatt aaectttaec gcgatactgc acgacctgaa tacatcactg 

18411 aactcagaag acatagatta cgtgcaatta acgctgataa aagtaaacta tcgggtgtgg agga&gttgc 

18481 taagttgttc aaacaaaaca agttacttgt tctttatgat aatatggata ggtttaagca agaggtattt 

18551 aaatatgttt ggcaccctac aaacggagag cctataaaag aatctgacga cgtgttggac tcgttaagat 

18621 atgccatata cacacacacc aaacctgaac gatcaaggag ggggaaatga catcgtataa gttaatagat 

18691 gatattgaag cacaaggaat attgcctaag catactgagg ctctaataga gtcacataaa gacgatagag 

18761 agagaatggt taatctctat aatagacaca agacacacat tgactatgta ccaatarcca aaegtcgacc 

18831 aattgaagaa aaagaagatt ttgaaactgg tggaaatgta aggcgaccag acgtgtctgt taataacaaa 

18901 cctaacaact ctttcgacag cgaaattgtt gatacacgtg ttggttatct acacggtgtc cctgtcactt 

18971 atgaectaga tgaaaacgca gaaaaaaacg aaaagtegaa aaagttcaca accaacttcg ccattagaaa 

19041 cagtgtcgat gatgaggatt ctgaaacagg taaaetggca gcaatttgeg gatatggtgc taggtcagca 

19111 tatattgata cgaatggtga cattaggatt aagaatatag atccccataa tgttatttcc gtcggcgaca 

19181 atattttaga acctacatac tcatcgcgct acttttatga aaaagatgat gataatggca ctgattatgt

19251 gtacgcagag tcctacgata atgcttatta ecatgtatcc cgaggagaag gtattgacgc ttcgcaagaa 

19321 gttggacgac acgaacatct atttgatcac aatccattgt ttggtgtacc raacaacaaa gagacgatag 

19391 gagacgccga aaaggttatt cacttaactg acgcatatga tttaacaatg agcgatgcat caagtgagat 

19461 cagtcagaea cgtttagcat acctcgrgtt acgcggcatg ggtaegagtg aagaaatgat tcaagaaeca 

19531 caaaagagtg gcgcatccga gttgttcgae aaagacatgg acgttaaata cttaacaaaa gatgtaaaCg 

19601 acacaatgat tgagaaccat ttagatcgaa ccgaaaagaa tatcatgcgt tttgcaaagt cagtaaactt 

19671 taattctgac gagctcaacg gaaatgtacc caccattgga atgaaaetca aacttatggc cttagagaac

19741 aagcgtacga cgcttgagcg taagatgaca gctatgttga ggtatcaatt oaaagttatt teacctgcat 

19811 taaagcgtaa agggtacaac ttggatgacg atagttattt aaacctgata tttaagttca ctcgtaacat 

19881 tccagttaat aagtcagaag aatcacaagt getaatraae ctgaagggac aegrttcaga acgaacaagg 

19951 ctaggacaac cacaactagt tgacgacgtt gateacgaat tagaogaaat ggaaaaagaa agtcctga&c 

20021 ctaatgacaa attacctgac atagacgaag gtgacgcaaa cgacaaatcc caaaataacc aaccagaatg 

20091 atattgatga gtatatcgag ggcttaatct etaaagcaga aaaaccaata gaacaactat ttgctaatcg 

20161 acttaaagag ataaaacaaa ccatcgcaga tatgtctgag aaatatcaaa acgatgargc gtatgttaca

20231 tggactgaat ccaacaaaca caaeaggctc aataaggagt taactcgtat aggcacaatg tcgaottatg

20301 actacaggca agcagctaag atgattcaga agtcacaaga agatgcttat atagaaaaat tccttatgag 

20371 cctttattta tatgaaatgg cgagtcaaac acctatgcag tttgatgttc cgagcaaaga ggtaatcaaa 

20441 tcagctattg aacaacccat egagttcatt cgtttaatgc caacaccaca aaaacatcgt gatgaagtat 

20511 tgaaaaagat acgtatgcac attacacaag gtattatgag tggagagggt taccccaaga tagctaaagc

20581 aatacgtgac gatgtcggca tgtctaaagc tcaatcactg cgtgtggccc gtacagaagc aggcagagca 

20651 atgtcacaag ccggacctga tagcgcaatg gttgctaaag ataacggttc gaacatgaag aaacgttggc 




20721 acgctactaa agatacacga acacgtgata cccaccgtca tttagatggg gaatcagtgg aaatagatca
20791 gaatcttaaa tcaagtgggt gtgttgggco ggcgcceaag ctaettattg gtgtaaacag cgegaaagag 
20861 aatactaatt gccgttgcaa attaccttat tatattgatg aaaacgaatt gccaactgca acgagagcac 

20931 gtaaagacga cggtaaaaat gaagttatcc cattcatgac ttatcgtgag tgggagaaat ataagcgaaa 
21001 aggtggtaac cgatatggac tttaaaataa aagtaaatgt tgatactggc gaagctatag aaaagtcaga 
21071 acgcatcaaa tccttgtacg aagagataat agagttacaa aacgaaaaag ttgttgtaaa cgtaacagtt 
21141 aaaaatgaag ctgatttaga catggttaaa acatctatca gcgaagaaaa cgctaaaaac aatgatttca 
21211 cactttttta gttgtctctt tgctactcga ccttagcatg tcgttaaact getttttact atgcactttt

21281 oggactgtta gggtacgega agggcaaaaa ggagttttga tatatgaata tcgaagaagt taagtctttt 
21351 tttgaagaac acaaagacga taaagaagta aaagattatc taaagggact taagacggtg tctgttgatg

21421 aegttaaagg etctttagat acagaagaag gtaaacgact cattcaacct gaattagatc gttatcattc 

21491 gaaaggatca gaatcatgga aagagaaaaa tctcgaggat ctaaecgaac aagaagtacg gaagcgcaat 

21561 cctgagcaac cagaagaaoa aaaaegtatt agtgctcttg aacaagagtc agaaaaaege gaegcagagg 

21631 caaaacgtga gaagttaaga agtaacgegc taggtaaage gcaggaacta aatctaccaa catccttagt 

21701 tgatagattt ecaggegatt ctgatgaaga taccgageaa aaettaaaag ctttaaaaga aacctttgac 

21771 aagtatgtcc aaaaaggcgt cgagtctaaa tttaaaccga gtggaagaga tgttaaagaa tcacgaaatc 

21841 aagatttaga ecctfccaaat gtaaagtcca ttgaagaaat ggcgaaagaa atcaatatta gaaaataaag 

21911 tgaggtaata aaatatggca actccaacat acacgccagg caatgttatt ttateggatt ttaaaaaegg 

21981 cgttattcca gcagaacaag gractttaat catgaaagac accatggcea attcagcaat tatgaaatta 

22051 gctaaaaatg agecaatgae agcacaaaag aaaaaattta cttactcagc aaaaggtgta ggcgcctact

22121 gggtatcaga aacggaacgt attcaaactt etaagectga atatgegcaa geagaaaegg aagctaagaa 

22191 aatcggtgta attattccgt taccaaaaga gtttcteaaa tggactgcaa aagatttctt caatgaggtt 

22261 aaacctctaa ctgeagagge atcttacaaa gcgtctgacc aagctgttat ctttggtact aaatcacctc 

22331 acaacacttc aactagtggt aaaccgctcg ctgaaggego agaagagaaa ggtaacgttg ttacagatac 

22401 taataattta cacgtagacc ttteggcact aatggctacc atcgaagatg aagagttaga tccaaaogga 

22471 gtaccaacta cacgttcatc cagaagtaaa aegegtaatg ctccagatgc taatgacaga ccattactcg 

22541 atgetaaegg gaacgagaet atgggattac cactatctta tactggagcg gaegtatacg acaaaaagaa 

22611 ategctagea ccaatgggtg attgggacta cgcacgttac ggtatcttac aaggtactga gtatgeaatt 

22681 tccgaagatg ccaegttaac gacgttacaa gcatcagatg cttctggcca accagtatca ttatttgaac

22751 gtgatatgtt cgctccacgt gcgacgacgc atattgeata cacgaaegtt aaaccagaag cgtccgcaac 

22821 gcttaaacca actgaatagg aggagatatg atggctaatc ctgeagaaga gattaaggta aaaaaagaca 

22891 atatgactat tactgttaca aagaaggcac ttgactctta ttaeagtctt gteggttaca aagaggttaa

22961 ateaegtegt actaegtctg ataagagega gtgataaaaa tgactcttta tgaagatgtc aaacttttac 

23031 ccaagaaaaa tggagtggaa gttaaaagtg atgaagaaga aatatttaag atggaagttg aoggaatact 

23101 agaagatgtt agggatataa caaaeaatga ccttatgaaa gaeggtcaag ccatttatcc ttactcaatc 

23171 aaaaagtatg tegcagatgt cctagagtat tatcaacgac ccgaagttaa aaagaattta aagtcaagaa 

23241 gcatggggac agtgtcgtac acttataacg atggtgtccc tgattacatt agtggagtat taaacaggta 

23311 taaacgagca aagtttcatc cgtttaaacc aacaaggrag aggtgttgtt tgtgtttaac ccatacgacg 

23381 aactccctca caccatttct actggaagca tcaaaaaagt aggagagtac ecaattatac aagagegett 

23451 tgtaagegae aaaacaatta aaggacctat ggatacgect accacatctg aacaactaaa atttcatcaa

23521 atgccacaag aatatgacag aaacctatat gtacctcatg actcgccaat acctaaaaac aatttatttg 

23591 agtatgaggg cagaatcttt agtattgaag gtgattctgt agaccagggc ggacaacatg aaattaagtt

23661 actaogactc aagcaggtgc cataeggcaa aagttaagta eggegctgat agcatggttg tcgaattgga 

23731 taagttcgat aagaaaatag aagagtgggt taaaaaaggt attgetaaaa caacgacgaa gatttacaac 

23801 actgetgtag cattagctcc tgtcgactta ggttttttag aagaaagtat tgactttaaa taettcgatg 

23871 gtgggttatc cagtgttata agtgtcggcg cagattatgc aatacacgtt gaatacggta ctggtataca

23941 tgctactggt cctggtggta gtcgtgetac aaagattccg tggagtctta aaggtgatga eggegaatgg 

24011 tacaccacat atggtcaagc gccacagcca ttttggaacc ctgcaattga cgcaggacgc aagacattcg 

24081 agcagtattt trcatagagg tggttaaata tgtgggtatc agttgagcct gaacttacaa atcaaatata 

24151 taaaagatta atetcagacc ctaacaccaa caaactagtt gatgataggg tttttgaegt tgttcaagat 

24221 gacgctgttt acceatatat tgttgtgggc gaatcaaacg tcactaacaa cgaatctagc gcaacaatga 

24291 gagaaacagt cggtattgtc atacatgtgt attcacagtt cgctacacaa tacgaggeca agctcatttt 

24361 aagegegaca ggttatgtgc ctaacagacc tatagaaata gataattacg agtttcaatt tagcegtate 

24431 gatagtcaag cagtattccc cgatatagac aggcttacta agcatggcac gataeggett ttatttaagt 

24501 acagacacaa aaagaaaaac gaaggagtgt actaaaegge gcaaaaaaac tatttagcag ttgtacgtcc 

24571 agctgaaacc gacttagatc cagtagaatc cttattatta gctgacttac aagaaggtgg acatacgatt 

24641 gaaaatgatt tagctgaaat agtacgaggc ggtaaaaegg actattctcc caatgeaacg teagaatcat 

24711 ttaaattaac aattggtaat gcgcctggag ataaaggaat tgaagcagtg aaacacgctg taoaaacagg 

24781 tggacagttg egtatatggc tttatgagcg eaataaaegt geagaeggta aacatcaegg aatgtttggt 

24851 tatgttgttc cagaatcatt tgaaatgtca bttgatgatg aaagtgacaa aatcgaacta tcattaaaag 

24921 ttaaatggaa cacagcagaa ggtgctgaag ataacttgee gaaagagtgg tttgaagccg caggtgcgcc 

24991 cacagttgaa taegaaaaac teggegaaaa ageeggaaca tccgagaatc aaaagaaagc tagtgctgta 

25061 tctgatteac acaeggaaga ccattctatg taaactaata gatcaagggg gegtaagetc cctatttttt

25131 tataaaaaaa ttgaaaagag gtatatattt tgactgaaCc taacccaatt acaacattaa aaatcaacga 

25201 eggagaaaaa gactacgaag cagaagcaaa ageaacattt gcattcgacc gaaaagctga aaaattctca 

25271 gaagatagcg aagatgggag aaaaggagca aegecaggat tcaatgtcat ctttaaeggt ttgctagaat 

25341 etagaaacaa agegatctta caatcctggg aatgtgctac cgcttactta aaaaacccac caactcgaga 

25411 acaattagaa aaagcaattg acgatttcac cactgaaaac gaggacactt tgeegttatt acaaggggct 

25481 ttggacaaac ccaacaatag eggtttttte aagagggaga grcgcccgta ctggatgaca ctgaacaaag

25551 cacogaatat ggecaaaage gaggaeaaag aaatgaegaa agcaggcata gaaatgatga aagagaatta

25621 caaggaaatc acgggcgcag aaccctacac gattacccaa aaataaggca actgacagct agatatttag

25691 gatataeccc tgaacatgaa ttgttagcac taacacctgc tgaacggcgt gattggctta tcggtggtca 

25761 ggataggtac ctagatcaaa gacaattatt aattgaacaa gegcaagcta aeggcttagt acaagcttct 

25831 aagaggctaa ctagtatgac tegtgacatt gagaaacaac gttaogaaat aagagaacce ggcagctatg

25901 ctcgtgtaca aaaagctaga ttagaagaag aaaaaagaag aegtgaaetc ttcaaagaag gtacaagaaa 

25971 acccetcgaa tcgaaaggag gttagccztt ggatacccat tctatggcaa agateatggc caatattaga

26041 gatctccaaa gcoacgtaag gaaagctcaa cgactagcaa agacgtctgt accaaacgaa attgaaacag

26111 atgtaaaagc agacatttca agactccaaa gagctctaca acgcgctaaa tcaatggetc aacgatggcg 

26181 agagcattct gttaaattat ccatgaaaac agatgagcac aaagcgaatc tagaacgcgc taaagcceaa

26251 gtagagcgat ctaaacaaca taaagtagat ttgaaactaa gtaacactga ateaatggcc aaatataacg 

26321 caaccaaagc tactgtcgaa gcttggagaa aacatgtcgt taagctggat ttagacgcaa accccgccaa 

26391 aatggcggct aaagggttta aagaagattt aatagatctt agcaggcata gctctgatat tgactccagc 

26461 agacggaaac taggaaataa attcacaaaa gaattcaatg aagtcgaagg agcagttaaa cgttctttcg 

26531 gaaga&ttgg teagaetatg agaaaagaag taaatggaac aagtgatate cggggtaaac ctaacaactc

26601 actgaaagat taeggcgaga aaatggacgc cttagctact aaaatccgaa cttccggtae tatcttcgcg 

26671 caacaggrcca aaggcttaat gattgctagc atacaagcat tgataccagt gattgccgga ttagtacctg 

26741 caataatggc agtactcaac gcggttggtg tattaggcgg tggcgtttte ggtttagttg gcgcattctc 

26811 tgtcgcaggt cccggagttg ttggcttcgg tgc&atggct actagcgctc ttaaaatggt tgaagatgga

26881 acattggcag caacaaaaga agttcaaaac cccagagatg cgagcgatca gtcaaaaact acatggcgtg

26951 atattgttaa agagaatcaa gcaagtatcc ttaatgcgat gccagcaggt accagaggcg tcacaagtgc 

27021 gatgtctcaa tcaaaaccac ccttatccga agtatctatg ctagttgaag caaacgcacg cgagtttgag 

27091 aatcgggtta aacattccga aacagctaag aaagcgtctg aagcattgaa tagcataggc ggcgcaatct 

27161 tcggagattt attgaacgct gcaggaogac ttggcgacgg attagttaac atcttcactc aattaatgcc 

27231 gttgttcaaa tttgtgtctc aaggactaca gaacatgtct atagctttcc aaaattgggc taatagtgta 

27301 gccggtcaga atgctattaa agcgtttatc gactacacta ccactaactt acctaagatt ggtcagatat 

27371 ttggtaatgt gtccgctggt atcggtaact caatgattgc ttttgcacaa aacagctcca acatttttga 

27441 ttggttggtt aaattaactt ctcaatttag agcatggtca gaacaagtag gacaaccaca agggtttaaa 

27511 gactttatca gttatgttca agagaatggt cctaccatta cgcagttaat cggcaatatc gtaaaagcat 

27581 tagttgcttt tggtactgca atggctccca eagecagtaa attgctagac tttatcacta atctagctgg

27651 acccaccgcc aaaccacccg aaacacaccc agctatagca caagetgctg gcgttatggg tattttaggc 

27721 ggtgrattcq gggctctaat ggctccgatt gttgctataa gtagcgtact cacaaatgtg tctggtttga 

27791 gcccacccag cgccactgaa aagattttag acttcgttag aacatcaagt ttegttactg gagctacgga 

27861 ageattaata ggtgcatccg gttcgatttc agcacctatc ttagcagecg ttgcagtaat tggtgcattc 

27931 attggcgtcc ccgttcatcc atggaaaaca aacgagaact ttagaaatac tattactgaa gcgtggaacg 

28001 gtgttaaaac ggcagtetcc ggtgcgattc aaggtgtagt cggctggrta actgaattgr ggggcaaaat

28071 ceaatctacc ttacaaccga taatgcctat attgcaagta ttaggacaaa cattcatgca agttttaggt 

28141 gttttggtaa caggcatcac tacaaacgtt atgaatatca tacaaggttt gtggacttta attacaattg 

28211 egttccaagc cataggaaca gtgatatccg tagcagtcca aatcatagta ggtttgttca ctgccttaat 

28281 tcagtcgctc actggcgact tctcaggtgc ttgggagact attaaaacta oggttaccaa tgtgcttgat 

28351 aogatctggc aatacatgca atcagtttgg gagtcaatta tcggcttttt aactggcgta atgaatcgaa 

28421 cactttctat gtttggtaca agttggtcac agatatggag tacaatcact aattttgtta geagtatttg 

28491 gaacactgtc acaagctggc tcagtcgagt ggcttegagt gtagctgaaa aaatggggca agcactaaac

28561 cttattacca caaaaggttc tgaatgggtt tctaacattc ggaatacagt tacaagtttc gcgagtaaag 

28631 tagctgatgg gtttaaaaga gttgtctcaa atgtaggtga cggtatgagt gatgcacttg gtaagattaa

28701 aagtttcctc agcgacctcc taaatgccgg agcggaacta atcggcaaag tagcegaggg tgtagccaaa

28771 tctgcgcaca aagtagtcag cgcggtaggc gacgogattc catcagcttg ggacccegta actccattcg 

28841 taagtggaca cggtggaggt agtagcttag gtaaaggttt agcggtatca caagcaaaag taattgctac 

28911 agactttggc agcgccctta ataaagagct atcctcract ttgacagata gtatagtaaa ccccgtaagt 

28981 acttctatag acagacacat gactagcgat gttcaacata gcttaaaaga aaataataga cctattgtga 

29051 acgtaacgat cagaaatgag ggogaccttg atttaattaa atcacgcatt gacgacatga acgccacaga 

29121 cggaagtttc aacrtattat aagggaggtt tgttagttga tagcgcacga tatagaagta ataaggaatg 

29191 gttcacagta tcgcgtcagt gacaatcctt tcacttataa ccacttggaa gcagttgaat ataacgttac 

29261 aggcgcagga tatcatcgta actatcctga tatagagggt atcgatggta gatttcataa ttacgctaaa 

29331 gaagaactta aaaaagtaga gcttaagata aggtataaag tacctaaaat tgctcatgct tcacatttaa 

29401 agtcagacgt ccaagcacta tttgctggac gtttccattt aagggaatta gctacaccag acaatccaat 

29471 taagtacgag catatattag atataccaaa agacaaacaa gcatttgagc ttgattatgt tgatggacga 

29541 caacttttcg taggactagt aagcgaagtt tcttttgaca caacacaaac atcaggggaa ttttctttgc 

29611 cgtccgaaac aaccgaacca ccatactttg aaagtgtcgg ttatagtact gatcttgaaa gtaataacga

29681 ccctgaaaaa tggtcggtac ctgatagatt gcctacaaac gaaggtgata agaggogtca aatgacattt 

29751 tacaacacta actcaggaga agtttattat aacggtgatg ttcctttaac acagtttaat cagtctaatg 

29821 ttgtcgaaat agagttagct gaagatgtta aagotaatga eaaggatgga ttcacttccc atacagataa 

29891 aggaaatacc tcagttatta aggaagttga tttaaaagcc ggagataaaa taatcttcga cggraaacat 

29961 acctatagag gctatttaaa tatagattct tttaataaaa ctteagaaca accggtttta tatccaggct 

30031 ggaatcgact eaagtctaat aaagtaatga aacaaattac atetagacac aaatcatact ttagataagg 

30101 agtagcctat gccaatttta ttaaaaagtc tacagggtgt agggcacgct attaatgtta gtacaaaggt 

30171 aagtaaaaag ctaaatgaag atagttcttt ggatctaact attatcgaga ocgcgagtac gtCtgacgca 

30241 ataggtgcta taactaaaat gtggacgatc actcatgttg aaggtgaaga tgatttcaac gaatatgtaa 

30311 ctgccacacc cgacaagtcc actatcggcg aaaaaataag gcctgacacc aaagctaggc aaaaagaact 

30381 tgacgaccct aacaatccta ggatccacca agagcataac gaaagtttca caggcgttga gtccttcaac

30451 accgtcctta aaggaacggg crataagcat gtatcacatc caaaagtaga tgcatctaaa ctcgagggat

30521 caggcaaagg agacacacga tcagaaatct ttaaaaaagg acctgagcgt tatcatctog aatatgaata 

30591 cgatgcaaag accaaaacgt ctcattcgca tgacgaacta cctaagttcg ccaattatta cattaaagct 

30661 ggtgtgaacg etgataacgc caaaatacaa gaagatgcat ctaaatgtca tacctttatt aaaggttatg 

30731 gtgattttga tggacaacag acttttgcag aagogggact acaaattgaa ttcactcacc cattagcaca 

30801 atcgataggt aaaagagaag cgccaccgct tgttgacgga cgtattaaaa aagaagatag tccaaaaaaa 

30871 gcaatggagt cattgataaa gaaaagtgcc accgccecta tttccttaga cttcgcagcg ttacgtgaac

30941 accccccaga agctaaccct aaaacaggtg atgtcgteag agcggtggat tctgccatag gacacaacga 

31011 cttagcgaga atagtcgaaa tcaccacaca cagagacgcg cacaataata ccaccaagca agatgxagta 

31081 ttaggagacc ctacaaggcg taatogtcat aacaaagcag ttcatgatgc tgcaaattat gccaaaagcg 

31151 taaaatccac aaaatccgac ccacctaaag aactaaaagc actaaaogca aaagttaacg caagtttatc 

31221 cacaaacaac gaaccggtca agcagaacga aaaaataaac gctaaagtcg ataagacgaa tactaaaaca 

31291 gttacaactg ccoatggtac gatcatgcac gactttacca gtcoatcaag cataagaaac atcaaatcaa 
aacaacttcc
taetcgaagg 



cggcgactcc gtagctagag 
gctaaaacga ctaatcttgc 
aaaacagcat ttacagacaa 
ctggttacac ggttattggg 
gccttttgtc ctgcaactga 
caagacaacg ccctacgagt 
acttgaggac tatgtaaacg 
cacacagatt actttaagcc 
acgaaaaagg tcacgaggtt 
gcaaccaacg gctcacggat 
gagtataact atcgctcgtt 
aagaaataca cgcacactca 
ttcaaatagc cgttttagca 
cgtattgata atacaggtta 
atgctttcao taaaaaggtc 
attcgaacca aaagagcaag 
tcattttggg tagaccctag 
tatctagatt gaagcccaac 
caatgcgtat agatacattg 
aagtttgtac gtttccaata 
atatatttaa cgacagatat 
atacaaagct tccgaaagac 
gataaaggta tagacaaag? 
cgcaaggtat eacttatgat 
ctacttacaa ggtttogata 
aataataact ttaaaggaga 
gtaaageact tttaataggg 
ccaaagaggt gttaaccaat 
gttaaaccgt caccaacaca 
atacgcaaga cacacaaaat 
ggacgtactg cctggacact 
atgcctaaat tcgaacgtgt 
aaaacgccgg ccattgggaa 
agatctctat atcactactg 
ggttggatat eagaagtaaa 
cgtctgcaca tcaattctta 
aaaggtggcc gaataatgat 
cacaatataa tccaattatt 
ttttgcagto actaagaata 
aaaaccgatg attataacgt 
atgggcgttt gcagtatgcg 
ctctacacaa aacgggagea 
gttagtgggt ttgatggtat 
gtaaagactt taaccaatta 
tgcgacaaaa ggcattcaac 
actagtgcaa cacaagctgt 
gtgttaacga agttgaacaa 
aaagtctaaa cttacagatg 
agcgcagtca acacaectag 
gcacgttaga gaagcctgga 
aagcaaatct ggtgcgttag 
gaetcaaacg atgagtacac 
gaaacttaac taagcaattt 
tgataaabtc ggaacaaoga 
aactcaaata atgcgcaagg 
acccaccagg cagtgctgaa 
atttaacttc acgccttata 
cagtggacag cccctaacga 
caatcaatct aaccgaacca 
tattgaggga tecggaccaa 
gacggtaacg geggcggtat 
acgatgtgta ctttgattta 
aattatgggg tggaaataat 
gcggtttacg caatagttta 
tagaaagttc gttttcacca 
gcatcaaacc aacaaagtgc 
tgcagatgac gcaagtgaac 
gaccgaaceg aaaaccaaca 
tttaaagaca ttaaaacttc 
tgggtgtaac cgacaaagaa 
gtcacaggtg taaegcttga 
acggcaogaa catgaatggc 
gagattaaat taggtcaaaa 
agagggaaag acagacagae 
gattctcggt ctgataggga 
taaaggaggt gattaccatg 




ggtcgcacgc aaaaactaac ttcacagaaa egttaggcaa 
aagaggcggc gcaacaacgg caacagttcc aataggcaaa 
gcagagcaaa caagaggaga cctaaccaca ttacaaggca 
caggcgtacc gacaggcacc gataaaacgg atacaaaaac 
agttaccaga aagaataatc cagattcaaa aatactagcg 
ggtacaacaa tacgccgcaa agacacggac aaaaacaaac 
ctcaaatatt agcctgtagc gagttagacg caccagtgtt 
atacaaccca gcttccagga aagcgagcat ggaggacggc 
attatgtacg agctaatcaa ggattattac agtttttacg 
caattacaag cctacatcea atgacaggtc ggaaaatagt 
agatgaaggt acgagcaaac ccgagaaaac gtttatatac 
gcgaaacaaa tcaaacactc gaacgacagc gttgaagatt 
atatgaccct aggccataac ggcgacggta tcaatgaagt 
tggtcataag acattgcaag otcgtttgta tcatgattat 
gagaaagccg tagacgaaca ctacaaagaa taccgagcga 
aaceggaatt caccaccgac ccatcgccat atacaaatgc 
aacgaaaact acttaeatga cgcaagcccg tccaggtaat 
ggacaattca ttgatagatt geccgtcaaa aacggcggtc 
atggagaart atggacttac ccagctgtat tggacagtaa 
tagaaccgga gaaataactt atggtaaega aotgcaagat 
acgccagcga cccataaccc catagaaaat ctaatgattt 
aagctaagaa cccatcgaat ctcattgaag taagaagtgc 
attgtatcaa atggacacac ctatggaaea cactccagat 
gcaggtatct tatattggta tacaggtgat tcgaacacag 
taaaaacaaa agaattgtta tttaaaogac gtatcgatat 
cttceaagaa getgagggte tagatacgta ttacgatcta
gcaactattg gacctggtaa taacagacat cactcaatte 
tcetaaaaaa caccgcaccc caagtatcga tgactgattc 
gaacccagca tatccaagtg atattacgga agttggtcac 
gcaccagacc tcccgrtacc gaaagcgrtt agagatgcag 
acaacggcgc tctaagacaa gtactcacca gaaacagcac 
cattgacatt ttcaataaga aaaacaacgg agcatggaac 
catatcccta agagtaccac aaaactacca gacttaaaaa 
aagaaccaaa aegatttact gatctcccta aagactttaa 
atogaacaca ccaggtaaca caaeacaagt attaagacgt 
gtcagaaact tcggtaccgg tggcgttggt aaatggagtt 
agcagacaae ttttegaaag acgataactt aatcgagtta 
gacacaaaca tcagtttcta cgaatcogat agaggaactg 
acagaccgtt aectataagt tctgaacatg ttaaaacatc 
agatagaggc gcttatattt cagacgaate aaegatagta 
ataccgaacg aatcctcaaa acattcaggc aaggtgcatg 
ataatgttgt tgttgaacgt caatttagct tcaatattga 
aacaaagcet gtttatacca aatctactca agatactatc 
aagcaagata tggatgatac acaaacgtta atagcaaaag 
aaatcgaaac caagcaaaac gaagctatac aagctaccac 
cacagetgaa gtcgataaaa tagccgaaaa agagcaagcg 
caaatcaatg gcgctgaccc tgrcaaaggt aacccaacaa 
ateaeggtaa agcaattgaa ccgtatgagc agtccacaga 
gattattcac aceaceaatg caacagatgc gccagaaaag 
caagacggtg ttgacgacgg ttcttcgttc gatgaatcaa 
ttgtttatgt tgttgataat aacactgccc gtgcaacatg 
aaaatacaaa atctacggca catggtacec gtcctataaa 
gttgaagaaa cgtctaacaa cgctttaaat caagccaagc 
gctggcaaca acataagatg acagaggcga otggtcaatc 
cgatttggga tatttaactg ctggtaatta ccatgcaaca 
agctacgagg gttatttatc ggtattcgtt aaagacgata 
actctaaaaa gatttacaca cgatcaatca caaacggcag 
acataagtca aoggcaccgt tcgacggtgg agcaaacggt 
cacacaaact actctatttt attagtaagt ggaacttacc 
ccacactacc taatgcaatt caatcaagta aagcgaatgt 
ttatgagtgt tcaetaccca aaacaagtag cactacttta 
ggtaaaacat caggttctgg agogaatgcc aacaaagtca 
gaaaatcaca gtaaatgata aaaatgaagt tetcggatac 
gatgtagacg ataacaatgt gtctatcaaa ctcaaagaag 
acggcgaaat taaatacaat ageaatttcg aaaaagaaga 
gtcagattta agtgatgagg aacttcgcgg aatggttgea 
atgttgacaa tgcaattgac gcaacaaaac gctacgttaa 
aaacaaatac cgagggggac gtttaaatga cgaagatgat 
ctatgtgtgg ggttgctaea aaaatgagca aattaagtgg 
gaatacgcat cgaccactgg tgaaaaatat ccagaggcaa 
ggctttttaa ttcaacacaa agtaggtggc gtaacgtttg 
gaattagaag attagaagag aacgataaaa caatgcttag 
aactcaagag caagtcaaca ttaaattaga taaaacttta 
gaaaaaaata agaaagaaaa cgacaaaaat acacgcgata 
ccatcttcag tacgattgtc atagctttac taagaactat 
cttaaaggga ttttaggata tagcttctgg gcgtgcttct 
36681 ggtttggcaa atgtaaataa cagtcaagag ccagcgctcc ggcactggct ttttatcttg attgaaatga

36751 ggtgcataca tgggactaee caacccaaag actagaaage ctacagctag egaagtggtg gagtgggcaa 

36821 agtcgaatat cggcaagagg attaaeatag ataaecatcg gggcagteaa cgtcgggata cacctaactt

36891 tatctttaa* agatattggg gttttgtaac atggggcaat gctaaggata tggctaatca cagatatcct 

36961 aagggtttcc gattccatcg ttattcatct ggatttgtac cggaacctgg agacatcgca gtttggcacc 

37031 ctggcaacgg aataggttcg gacggacaca ccgcaatagt agtaggacca tetaataaaa gccattttta 

37101 tagcgttgac caaaactggg ttaattctaa tagttggaca ggttctccag gaagatcagt aagacacccc 

37171 tatgtaagtg ttacaggccc tgctaggcct ccatactcaa aagatactag caaacetagt agtactgata 

37241 caagtccagc accaaaagcc aatgactcaa caattactgg cgaagcgaag aaaccgcaat etaaagaagt 

37311 taaaacagta aaatacactg cttacagcaa tgttttagat aaagaagagc acttcattga tcatatagtt 

37381 gtaatgggtg atgaacgctc agatattcaa ggatcacaca taaaagaatc aatgcatatg cgtcctgtag

37451 aogaactgta tacgcaaaga aataagttta taagcgatca cgaaacaccg catttatatg ecgatagaga 

37521 ggctacatgg cttgctagac caaccaatcc Cgatgacccg cgccacccta attggctagt tattgaagta 

37591 tgtggtggtc aaacagacag caaacgacaa ttcteattga atcaaataca agcgtcaata cgtggtgttt 

37661 ggttattgtc agggaccgat aaaaacttac ctgaaacgac gttaaaggta gaccctaata ttcggcgtag 

37731 catgaaagat ctaattaatt acgacttgat caagcaaggt ataccggata acgcaaagta tgagcaagtt 

37801 aaaaagaaaa tgcttgagac atacattaaa cgagatatat cgacacgaga aaatataaaa gaagtaacga 

37871 caaaaacaac aataagaatt agtgacaaaa caccagccga cagtgcgtcc acaogaggcc ccactccatc 

37941 agacgaaaaa ccaagcaccg ctaccgaaac aagtccactc acactccagc aagcactgga tagacaaatg 

38011 cccaggggta acccgaaaaa atctcataca tggggctggg ctaatgcaac acgagcacaa acgagctcgg 

38081 caatgaatgt caagcgaata Cgggaaagca acacgcaatg ccatcaaatg cttaatttag gcaagtacea 

38151 aggcatctca gttagtgcgc tcaacaaaat acccaaagga aaaggaacgc tcgaoggaca aggcaaagca 

38221 ttcgcggaag cttgcaagaa aaaca&eact aacgaaaete atttgaregc gcacgctttc ttagaaagtg 

38291 gacacggaac aagcaacttc gctagtggta gataeggtge atataattac ttcggtactg gtgcattcga

38361 caacgaccct gattatgcaa tgacgtttgc eaaaaataaa ggttggacat ctccagcaaa agcaatcatg

38431 ggcggtgcta gcttcgcaag aaaggattac atcaataaag gtcaaaaeac ottgtaccga attagatgga 

38501 atcctaagaa cccagccacc caccaataog ctactgctat agagtggtge eaaeateaag caagtacaat 

38571 cgctaagtta tataaacaaa tcggcttaaa aggtatctac ttcacaaggg ataaatataa ataaagaggt

38641 gtgtaaatgt acaaaataaa agatgttgaa acgagaataa aaaatgatgg tgttgaetca ggtgacattg 

38711 gctgtcgatt tcacaccgaa gatgaaaata cagcatctat aagaataggc atcaatgaca aacaaggtcg 

38781 tatcgatcta aaagcacacg gcttaacacc tagattacat ttgtttatgg aagatggccc tatattcaaa 

38851 aatgageccc ttattatcga cgatgttgta aaagggttcc ttacctaeaa aatacctaaa oaggrtatca

38921 aacacgcegg ttatgtccge tgtaagctgt ttttagagaa agaagaagaa aaaatacacg tcgcaaactt 

38991 ttcttccaat atcgttgata gcggcactga atctgcxgta gcaaaagaaa tcgatgttaa attggtagat 

39061 gatgctacca ogagaatttt aaaagataac gcgacagatc tattgagcaa agactttaaa gagaaaatag 

39131 ataaagatgt catttcttac accgaaaaga atgaaagtag atttaaaggt gcgaaaggtg ataaaggega 

39201 aecgggacaa cccggrgcga aaggtgatac aggcaaaaaa ggagaacaag gcgcacccgg taaaaacggt 

39271 aetgtagtac caatcaatcc tgacactaaa atgtggcaaa ttgatggtaa agatacagat accaaagcag 

39341 aacctgagtt attggacaaa atcaatatcg caaatgttga agggttagaa gataaactgc aagaagccaa 

39411 aaaaatcaaa gacacaaetc tcaacgactc taaaacgtat acggattcaa aaattgctga accagttgat 

39481 agcgcgcctg aatctatgaa tacattaaga gaattagcag aagcaacaca aaacaactct atcccagaaa 

39551 gtgtactgca acagattggc tcaaaagtta gtacagaaga ttttgaggaa ttcaaacaaa oactaaacga

39621 tttatatget ccaaaaaatc ataatcatga tgageggtat gttttgtcat ctcaagcttt taccaaacaa

39691 caageggata atttatatca actaaaaagc gcatetcaac cgacggttaa aatteggaca ggaacagaaa 

39761 atgaatataa ctatatatat caaaaagacc ctaatacact ttacttaatt aaggggtgat etttaeggaa 

39831 ggtaatttta aaaatgtaaa gaagtttatt tacgaaggtg aagaatatac aaaagtatat gctggaaata 

39901 tccaagtatg gaaaaagect tcatcttctg taataaaacc cttacccaaa aataaatatc eggatagcat 

39971 agaagaatca acagcaaaat ggacaacaaa tggagttgaa cctaataaaa gttatcaggt gacaacagaa 

40041 aatgtacgta gcggtataat gagggtttcg caaactaatt taggttcaag tgatttagga atatcaggag 

40111 teaatagegg agttgcaagt aaaaatatca actttagtaa tcctccaggg atgttgtatg tcactataag 

40181 tgatgtttat tcaggatcto caacattgac cattgaataa ttttaaacga ctaatttttt agtcgttttt 

40251 tattttggat aaaaggagca aacaaatgga tgcaaaagta ataacaagat acategtatt gatcttagca

40321 ttagtaaatc aattcttagc gaacaaaggt attagecega ttccagtaga cgatgagact atatcatcaa 

40391 taatacttac tgttgttgct ttatatacta cgtataaaga caatccaaca tctcaagaag gtaaatgggc 

40461 aaatcaaaag ctaaagaaat ataaagctga aaacaagtat agaaaagcaa eagggcaagc gecaattaaa 

40531 gaagtaatga cacctacgaa tatgaacgac acaaatgatt tagggtaggt gctgaccaat gttgataaca 

40601 aaaaaccaag cagaaaaatg gtttgataat tcattaggga ageagttcaa tcctgacttg ttttatggat 

40671 ttcagtgtta egattacgea aatatgtttt ttatgatagc aacaggogaa aggttacaag gtttataogc 

40741 ttataatatt ccatttgata ataaagcaag gattgaaaaa tacgggcaaa taattaaaaa ctatgatagc 

40811 tttttacege aaaagttgga cattgccgct ttcccgtcaa agtatggtgg cggagctgga catgttgaaa 

40881 ttgttgagag cgcaaattta aacacttcca catcatatgg gcaaaattgg aatggt&aag gttggacaaa 

40951 tggcgttgcg caacctggtc ggggtcctga aactgttaca agacatgtcc attattacga tgacccaatg 

41021 tattttatta gattaaattt cceagataaa gtaagtgttg gagataaagc taaaagcgtt attaagcaag 

41091 caactgccaa aaagcaagca graattaaac ctaaaaaaat tatgcttgta geeggtcatg gttataacga 

41161 teetggagca gtaggaaacg gaacaaacga aegegatttt ateegtaaat atataacgee aaatatcget 

41231 aagtatctaa gaeatgeagg tcatgaagtt gcattatacg gtggctcaag teaatcacaa gacatgtatc 

41301 aagatactgc atacggtgtt aatgtaggaa ataataaaga ttatggacta tattgggtta aatcacaggg 

41371 gtacgacatt gttctagaga ctcatttaga cgcagcagga gaaaatgcaa gtggtgggca tgttattatc 

41441 tcaagtcaat ecaatgegga cactactgat aaaagtacac aagatgttat taaaaataac ttaggacaaa 

41511 taagaggtgt aacacctcgt aatgatctac egaaegtcaa tgtatcagca gaaataaata tcaattatcg 

41581 tttatctgaa ctaggtttta ttactaacaa aaaagatatg gattggatta agaagaatta tgacttgtae 

41651 tctaaattaa tagctggtgc gatccatggt aagectatag gtggtttggt agctggtaat gttaaaacat 

41721 cagctaaaaa ccaaaaaaat ccaccagtgc cagcaggtta tacaottgat aagaataaeg tgcctcataa 

41791 aaaagagact ggtaactaca cagttgccaa tgttaaaggt aataaegcaa gggaeggcta ttcaactaat 

41861 tcaagaatta caggtgtatt acctaataac gcaacaatca aatatgaegg cgeatactgc atcaatgggt 

41931 atagatggat tacttatatt gctaatagtg gaeaaegteg etatattgeg acaggagagg tagataaagc 
42001 aggtaatagg ataagtagtt ttggtaagtt tagcacgatc tagtatttac ttagaataaa aattttgcta

42071 cattMCcat agggaatctt acagttatta aataactatt tggatggatg ttaacatccc tacacacttt

42141 ttaacattac tctcaagatt caaatgtaga taacaggcag gtactacggt acttgcctat tttcttgtta

42211 taatgtaatt acattaccag Eaaccaatct ggcctaaaac cacarttccg gtagccaatc cggccacgca

42281 gaggacttac ttgcgxaaag tagtaagaag ctgactgcat attcaaacca cccataccag ctgctgggtg

42351 grtgtccttt atgttatatt ataaatgacc aaaccacacc acetattaat ctaggagtgt ggttattttt

42421 tatgcaaaaa aaacgaaaaa aagrtcataa aaagtattge atatcacgtt caaccgtgtt ataataaggt

42491 ataccagxtg agoggaggat aaaaagtgtt agaaaatttt aaaaccacag cagaaatcgc cttttataca

42561 atgtcagcaa ttgccatagc gaaaaeattg aaaaaagacg ataagtaagt agacaagccc gaaagggctg

42631 tctatataca aattctaaca ctaaaatact argaaaacaa cttacactat tttaatcact cctatttgga

42701 taaacgtgtt tctaggcaac gatataagta aaagtgttgt tgcactgett acCactttac tgcttaccaa

42771 tttatggaag agggacaaaa atgacagcaa taaaagaaat aattgaatca atagaaaagt tattcgaaaa

42841 agaaacggga tataaaattg ctaaaaattc eggatcaeca tatcaaacrg tgeaagaect aagaaatgga

42911 aaaaoatctt tatcagatgc cagatttaga acgataataa agttatacga gtatcaaaga tegctcgaaa

42981 acgaagaaga taaataaaag gagccaaaaA tatgtttgtt acaaaagaag aattta&aac tccgaatgta

43051 aaagaagtat ctgaaccagg taaaaacctt ataaaaatta cagatggaag acatgeaata tattgggtaa

43121 atgatagaca cgcagcactc gaccacaaaa aaggcgattt gtacccgcaa aaageatacc caaaatacat

43191 caaaagaaaa ttagtaagtt aaataattag aaaaccacgt cctaattgac gtggttattt tttaggtttg

43261 cgcgtgtcaa atacgtgtca attcagrtct atttctttag ttttctttct aaacttaatt getegtaaac

43331 cgcacagtta taggcttttc agctatatac caagataaga cttatcccgc cgtcrccata aaaatatget

43401 cggaaacctt gacttaatgg ggtctcaacc tagcaagtgt caaataegtg tcaagaaaat aatcctctga

43471 cacgctgacc ctgctctttt ttatgttcac caagtaagtg agagtaggtg tctaaagtta tagacatatt

43541 ataatggcct aatcttttgc caatatattc aatagg
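
The listing above follows a regular layout: each full row gives a start position followed by space-separated groups of ten nucleotides, seventy per row. That regularity allows a simple mechanical consistency check. The following is a minimal Python sketch offered for illustration only and is not part of the original disclosure; the function names `parse_listing`, `position_gaps`, and `non_standard_symbols` are hypothetical, and the two sample rows are the rows numbered 16941 and 17011 in the listing.

```python
def parse_listing(text):
    """Return (start_position, sequence) pairs from sequence-listing rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A listing row is a decimal start position followed by sequence groups.
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        rows.append((int(parts[0]), "".join(parts[1:])))
    return rows


def position_gaps(rows):
    """Return (position, next_position) pairs whose numbering is not contiguous."""
    gaps = []
    for (pos, seq), (next_pos, _) in zip(rows, rows[1:]):
        # The next row should start exactly where this row's residues end.
        if pos + len(seq) != next_pos:
            gaps.append((pos, next_pos))
    return gaps


def non_standard_symbols(seq):
    """Return the characters in seq other than a, c, g, t (case-insensitive)."""
    return set(seq.lower()) - set("acgt")


sample = """
16941 tgtgatacag cgaaaacaac tgattaaact aacaccgaag caagaaaagt ttgtgccagg actcatagag
17011 ggcaagagcc aacggaaagc atatattgac gcagggtate cgactaaagg taagagtggg gaatatceag
"""

rows = parse_listing(sample)
print(position_gaps(rows))
print(sorted(non_standard_symbols(rows[1][1])))
```

A symbol reported by `non_standard_symbols` is not necessarily an error: some letters (for example `r`) are standard ambiguity codes, while others may be transcription artifacts, so flagged rows would still need to be compared against the original filing.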
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Table 10 



Bacteriophage 96 ORFs list 



SXD 


LAN 


FSA 


POS 


a.a . 


1 RBS •«qu«nce 


STA 


8 TO 


100713 


96ORF001 


1 


2S999. .29142 


1047 


ccttqaatcgaaaggaggttBgect 


ttg 


taa 


100734 


96ORF0Q2 


1 


32008. .33906 


632 


c 1 t 1 1 a cga c t aaaggaqgcaacc a 


atg 


taa 


100735 


96CRF003 


1 


30109. .31995 


628 


c t a t at c t t ogat aaggagragcc t 


atg 


taa 


100736 


96ORF004 


1 


36760. .38634 


624 


a t c 1 t gat tgaaa tgaggtgcat ac 


atg 


taa 


100737 


96ORF005 


3 


33903. .35729 


608 


gt t tattcgaaqqaaaggtgqttga 


ata 


taa 


100738 


96CRFO06 


2 


40589. .42043 


484 


aacga t c t agggt agg t gt tgacca 


atg 


tag 


100739 


96ORF007 


1 


18652. .20091 


479 


tatacacacataccaaacctgaaog 


att 


tga 


100740 


96ORF008 


2 


8960.. 10201 


413 


tggcagaat t tgggggcgat aacga 


atg 


tga 


100741 


96ORF009 


2 


17447. .18670 


407 


gacgcaa t aacgg aagt gat eg tea 


atg 


tga 


100742 


96ORFO10 


1 


38647. .39819 


390 


taaatataaataaagaggtgtgtaa 


atg 


taa 


100743 


96ORF011 


-1 


119. .1195 


358 


gtagctcgcctacecttattatttt 


ttg 


taa 


100744 


96ORF012 


2 


20045. .21013 


322 


tttaacgaeaaattaectgacatag 


atq 


tqa 


100745 


96ORF013 


3 


29157. .30098 


313 


act t a t t a taagggaggt t c gt tag 


ttg 


taa 


100746 


96ORF014 


1 


21925. .32639 


304 


agaaaa taaag tgagg t aat aaaat 


atq 


taa 

Z3 


100747 


960RF015 


1 


5812. .6591 


259 


a t acacoq t aaaqg t qqqaq aa t a q 






10074B 


96ORP016 


1 


7852. .8607 


251 


aa t aaaat q t t qaaaqqaqaqaaa a 


atq 


taa 


100749 


96ORF017 


3 


3444. .419C 


24B 


aaatt taacatt aatat cactt taa 


qtq 


taa 


100750 


96ORF01B 


-3 


28281. .29000 


239 


taagctatgttqaacatcqctaqtc 


atg 


tqa 


1007S1 


96ORF019 


3 


7188.. 7859 


223 


t ttaccgt tctaggacgtqqttt aa 


atq 




100753 


96ORF020 


3 


21324. .21908 


194 


gaagggcaaaaaggagttttgacat 




taa 


100753 


96ORF021 


3 


6612. .7175 


187 


at t aaaaat t aat t aaaaggacggt 


ata 


tag 


100754 


96ORF022 


2 


24536. .25093 


18S 


aaagaaaaacgaaggagtgtattaa 


atg 


taa 


100755 


96ORF023 


1 


5275. .5811 


17B 


catgaaategtaggaggtatgaaaa 


gtg 


taq 


100756 


96ORF024 


3 


14481. .15014 


177 


taaaacgataggagataacgaataa 


atg 


taa 


100757 


96ORF025 


2 


25157. .25666 


169 


ataaaaaaattgaaaagaggtatat 


att 


taa 


100758 


96ORF026 


-3 


15084. .15590 


168


ccattcttaacatagcccttaattc


atg 


taa 


100759 


96ORF027 


-1 


1229. .1732 


167 


aatagcaaataaaggagtgtaaaac 


atg 


taa 


100760 


96ORF028 


1 


16960. .17454 


164 


aaggcgtatgatacagtgaaaacaa


ttg 


taa 


100761 


96ORF029 


-1 


1736. .2227 


163 


tacgagaaaaggagtcatataaaag


atg 


taa 


100762 


96ORF030 


1 


25531. .25995 


154 


ccttcaagagggagagtcgctcgta 


ctg 


tag 


100763 


96ORF031 


2 


23633. .24097 


154 


tttagtattgaaggtgattctgtag


atc


tag 


100764 


96ORF032


-2 


2248. .2706 


152 


ataagacaccaaaggggtttggcgc 


•'3 


tga


100765 


96ORF033 


-3 


39147.. 39605 


152 


agcatataaatcgtttagtgtttgt 


ttg 


taa 


100766 


96ORF034 


2 


13181. .13615 


144 


tagaagtcgaaaaagtggaggcaat 


ata


taa 


100767 


96ORF035


2 


10628..11053


141 


gagctaggatcgcaagcaacgatat




tga 


100768


96ORF036 


2 


24110. .24535 


141 


gtacttttcatagaggtggttaaat


atg 


taa 


100769 


96ORF037 


1 


12583. .12996 


137 


atgaggaacagaagcaaccaacttt


att 


tga 


100770


96ORF038 


1 


15628. .16032 


134 


atgttaagaatgatgcctagtttaa 


"3. 


taa 


100771 


96ORF039 


3 


39816. .40220 


134 


ctaatacactttacttaattaaggg


gtg 


taa 


100772 


96ORF040


-3 


27528..27932


134 


tttccataaataaacgaggacacca 


atg 


tga 


100773 


96ORF041 


3 


16206. .16607 


133 


gatgagggcggaggtgtcagagtag 


atg 


tga 


100774 


96ORF042


2 


35720. .36106 


128 


aagttactataactaaaattatggg 


gtg 


taa 


100775 


96ORF043 


-2 


35713. .36081 


122 


ttaaacgtccccctcagtatttgtc 


ttg 


taa 


100776 


96ORF044


-2 


9460..9828


122 


agtatccatcagttgaagataatcc 


ata 


taa 


100777 


96ORF045 


-3 


5139. .5504 


121 


ttcttcttgtattctgtaatattca


att 


tga 


100778


96ORF046 


2 


11513. .11872 


119 


aagtaaatgtatagaggtggaataa 


atg 


taa 


100779 


96ORF047


2 


22991. .23350 


119 


gtcgtactacgtctgataagagcga


gtg 


tag 


100780 


96ORF048 


3 


8607. .8963 


118


tggaaaaagaattgagtgatgacta


atg 


tga 


100781 


96ORF049 


1 


23353. .23697 


114 


atccgtttaaaccaataaggtagag 


gtg 


taa 


100782 


96ORF050


-2 


2728. .3072 


114 


tggtaaattagtattacattaagta 


ata 


taa 


100783 


96ORF051


3 


4692. .5021 


109 


tcaaaatatacggaggtogtcaact 


atg 


tga 


100784 


96ORF052 


-1


20882. .21211 


109 


gtagcaaagagacaactaaaaaagt


gtg 


taa 


100785


96ORF053


1 


40252..40578


108


acgactaattttttagccgtttttt


att 


tag


100786 


96ORF054


1 


4942. .5262 


106 


aatataaaactaaaaaacaaaattt


atg 


tag 


100787


96ORF055


-2 


4840..5151


103 


ccgtcgcaatatatagttcgcttaa 


atc


taa 


100788


96ORF056


3 


36324. .36623 


99 


aatttaacacaaagtaggtggcgta


atg


taa


100789 


96ORF057


2 


1394. .1690 


98 


cttcagtggctcttttagcatttaa 


ata


taa 


100790


96ORF058 


-3 


26247. .26537 


96 


tacttcttttcccataatccgacca


att 


tga 




96ORF059


-1 


21485. .21772 


95 


agactcaacgcctttttgaacatac 


"3 


tga 




96ORF060 


-3 


22647. .22931 


94 


cctctttgtaaccgacaagactgta


ata 


taa 




96ORF061 


1 


14023. .14304 


93 


ttatctaattaagggggacgagtga


P*9 , 


taa 




96ORF062 


-2 


38281. .38559 


92 


tatataacttagcgattgtacttgc 


"9 
100795


96ORF063 


-3


30786..31064


92 




gtg 


tga 


100796 


96ORF064


-2 


30205. .30480 


91 


atgcatctacttttggatgtaatac 


ata 


tag 


100797 


96ORF065 




2617. .2886 


89 


aaggtctaataaaaatttctccttc


ttg 


taa 


100798 


96ORF066 


3 


28056. .28325 


89 


aaggtgtagtcggccggttaactga


act 


taa 


100799 


96ORF067 


-3 


17142. .17411 


89


ttccgctattgcgtcgtgaagttgc 


ttg 


_ eg* 


100800


96ORF068 


2 


12326. .12589 


87 


aatgcatgtcgtttggtctgcctaa


. 


tag 


100801 


96ORF069


2 


42734. .42997 


87


cttttaggcaacgatataagtaaaa 




taa 


100802 


96ORF070 


1 


11869. .12129 


86 


aaacgcccaagaaacggagtgaagc


ata 


taa 


100803 


96ORF071 


3 


15396. .15655 


86 


aacaagccatacaaatcatcgacaa


att 


taa 


100804 


96ORF072 


-3 


37749. .38009 


86 


agattttttcgggttacccctagac


att 


taa 


100805 


96ORF073


3 


11244. .11501 


85 


acacgcatatacagaggtggaacaa


atg 


tag 


100806


96ORF074


-3 


42936..43193


85


aattattcaacttaccaattttctt


ttg


taa 


100807 


96ORF075 


-3 


26610. .26867 


85


tactgccaacgtcccatcttcaacc 


act 


taa 


100808


96ORF076 


-1 


11126. .11380 


84 


tttatctaatacatttaagccaacc


atc


taa 


100809


96ORF077


-2


16537. .16791 


84 


cacccaccacacaggcaggcagcag


gtg


tag 


100810


96ORF078


-3 


19521. .19775 


84 


aacaaccttgaattgacacctcaac


aca 


tga 


100811 


96ORF079


3 


13608. .13859 


83 


ttagggcaaatggaggcagacacaa


atg 


tag 


100812 


96ORF080


-3 


28029..28280


83 
atc


tga 


100813 


96ORF081


3 


20973. .21221 


82


aatgaagttatcccatccatgactt 


atc


tag 


100814


96ORF082


-1 


8729..8974


81 


cgattatcgtgctttcaacttcaaa


ttg 


tga 


100815 


96ORF083 


-3 


3147. .3392 


81


cttagcctttatataatcaactcct


gtg 


tga 


100816 


96ORF084 
1611..1853


80


tgctttatctttagtttctttcttt 


ctg 


tga 


100817 


96ORF085 


-2 


29470. .29709 


79 


ctcttatcaccttcgtttgtaggca


atc


taa 


100818 


96ORF086


1 


35188 . .35424 


78 


gcgcaaggcgatttgggatattcaa


ctg 


tag


100819 


96ORF087 


-2 


13039. .13275 


78 


ttttgattgagetetaaagtgrctr 


att 


tag 


100820 


96ORF088


3 


24930. .25163 


77 


gaactaccattaaaagctaaatgga 


ata 


tga 


100821 


96ORF089 


-3 


22329. .22562 


77 


cccagcataagatagtggtaatccc


ata 


taa 


100822 


96ORF090 


-3 


16803..17036


77 


acctttagtcgaataccctgcgtca


ata 


tag 


100823 


96ORF091


-1 


22559..22789


76 


aacgctcctggcttaacgtccatgt 


atg 


taa 


100824 


96ORF092


3 


18360..18587


75 


attgcaaaagatattgtaagtagat 


atg 


taa 


100825


96ORF093


-2 


25384..25608


74 


catgatccccttgtaattctctttc 


atc


taa 


100826


96ORF094


1 


10417..10638


73 


aacacacaccaaggagtgtcaaaaa


atg 


tag 


100827


96ORF095


3 


12963 . .13184 


73 


tactaaacgaagataaaactatgac


att 


taa 


100828 


96ORF096


1 


42994. .43212 


72 


gaccgctcgaaaacgaagaagacaa


ata 


taa 


100829 


96ORF097


-1 


36047. .36265 


72 


ccaagcactacacctgtgacttttc


atc


taa 


100830 


96ORF098


-2 


36766. .36984 


72 


caggttccggtacaaatccagatga


ata 


taa 


100831 


96ORF099 


-2 


34765. .34983 


72 


tcattcttcttataaaacgggtacc


atg 


tag 


100832 


96ORF100 


1 


10198. .10413 


71 


acaagaagactcagaggtcctccac 


atg 


taa 


100833 


96ORF101 


1 


15208. .15423 


71 


gagaaacaagctaagacaaggagag


atg 


tga 


100834 


96ORF102 


3 


4209. .4424 


71 


atcctaaaacgaaatataggagagg


ctg 


tag 


100835


96ORF103 


3 


11673..11888


71 


catgcaccttatggtatgcgcttag 


ctg 


taa 


100836 


96ORF104 


3 


12117. .12332 


71 


tttacgtccaaagagcttttgactt


gtg 


taa 


100837 


96ORF105 


3 


23892. .24107 


71 


gatggtgggctatccagtgttataa


gtg 


taa 


100838


96ORF106


-3 


34428..34643


71 


tagacccttgccaatttgttgttga 


att 


taa 


100839 


96ORF107


-3 


24495..24710


71


ggcacattaccaattgttaatttaa


atg 


taa 


100840 


96ORF108


-1


23876..24088


70


acatacccaaccacccctatgaaaa 


ata 


taa 


100841 


96ORF109 


-2 


17317. .17529 
att


taa 


100842 


96ORF110 


-3 


38931. .39143 


70


actttcattcttttcgatgtaagaa


a eg 


taa 


100843 


96ORF111


-3 


21855..22067


70 


agtaaattttttcttttgtgctgtc 


att 


tga 




96ORF112


1 


3217. .3426 


69 


aaacgccaacgggaggtgatacgaa


atg 


taa 


100845 


96ORF113




25469..25678


69 


tcagggatatatcctaaatatctag 


ccg 


taa 


100846


96ORF114


-2 


9838. .10047 


69 


ataataatcaccacggtaaagcagc 


atc


tga 


100847


96ORF115


1 


13819. .14022 


67 


gcagtaggggttatggcaggccaag


"3 


tga 


100848


96ORF116


-1 


41033. .41236 


67 


caacttcatgacctgcatgtcttaa


ata 


taa 


100849 


96ORF117


-3 


24711 . .24914 


67 


cctgctgtattccatttaacttcta


atg 


taa 


100850 


96ORF118


-1 


12374. .12574 


66 


cccatctcctctaaaataaagtcgg 


ttg 


taa 


100851 


96ORF119


-1


3980. .4180 


66 


cccctacatttcgttttaaaattcc


att 


tga 


100852 


96ORF120 


-3 


6033..6233


66 


ttgtaatttagaaacataacgataa


ata 


taa 


100853 


96ORF121


-2


37939. .38136 


65 


ctgaaatgccctgatacttgcccaa 


att 




100854


96ORF122


2


37892. .38086 


64 


acgacaaaaacaacaacaagaacta


3tg 


tga 


100855 


96ORF123


-3 


29193. .29387 


64 


ggacgtctgaccttaaatgcgaagc 


ata 


c ? a 


100856


96ORF124


1 


4408. .4599 


63 


cccatcggtaccaatttaatgatta


atg 


taa 


100857


96ORF125


-1 


7787..7978


63 


ttaaaaatccaagttttgccatcgt


att 


tga


100858


96ORF126


-3 


27027. .27218 


63 


aaattcgaacaacggcattaatcga


gtg 


tga 


100859 


96ORF127


3 


15051 . .15239 


62 


a t cgagc caaggagg t z c eggogaa 


gtg 


tga


100860 


96ORF128


-1 


6914. .7102 


62 


agcgaatgggctcgatcgttgactc


ata


• c 9* 


100861 


96ORF129


-3 


31332. .31520 


62 


ccttacttgctctgcttgcccataa


ac 3 


122— 


100862 


96ORF130 


-3 


30084. .30272 


62 


aaaatcatcttcaccttcaacacga 


gtg 


taa 


100863 


96ORF131


3 


11058. .11243 


61 


agaaaaagagaaatgaagcgatcca


atg 


taa 


100864 


96ORF132


-3 


36434 . .36619 


61 


caagcatggtaatcacctcctttaa 


ata 


tga 


100865 


96ORF133


-1


35591..35776


61 


ccaaaccatcgcgcaaaccgccagc


att 


taa 


100866 


96ORF134


-2 


9250..9435


61 


acceatgagcttataaeecgrctta 


att 


tga 
100867


96ORF135


1


29563. .29745 


60


cgacaac t 1 1 1 tgtaqgaceagr aa 


gtg 


tga 


100868 


96ORF136


-3


12486. .12668 


60 


cactttactttcaacttgttcagga 




taa 


100869


96ORF137


-1 


14501..14680


59 


caaactgaaagctaagtaatcagca


atc


tq- 


100870 


96ORF138


-2 


23326..23505


59 


cttgtgacatctgatgaaatttcag


ttg 


tga 


100871 


96ORF139


-3 


42672. .42851 


59


aatccggaacctctagcaattctat


atc


taa 


100872 


96ORF140 


-3


31137. .31316 


59 


acttgatcgactagtaaagtcgtac


atg 


taa 


100873 


96ORF141


-3 


18969. .19148 


59 


aacaaaaataacattatagggatct 


ata 


taa 


100874 


96ORF142


-3 


4740. .4919 


59 


cataaattttgttttttagttttat 


att 


tga 


100875 


96ORF143


2 


36107. .36283 


58


a acaaac act gagggggncgt t t aa 


atg 




100876 


96ORF144


3 


16029. .16205 


58


tatacgaagtaaagaaggtagataa


ata 


tag


100877 


96ORF145


-3 


29013..29189


58 




att 


tga 


100878


96ORF146


-3 


14883..15059


58 


aatctttgaatgttgcgactaagta


ttg 


taa 


100879 


96ORF147


-1 


18251. .18424 


57 


catcagcgtcaattgcacgtaatct


atg 


taa 


100880 


96ORF148


-1 


13583..13756


57 


aataccttctttaaccgaatgttga


ata 


taa 


100881 


96ORF149


-2 


10756. .10929 


57 


taaattcacatctctatactgatat 


ctg 


tag 


100882 


96ORF150 


2 


14171. .14341 


56 


atttttaatgaagaagtgtcactaa 


ctg 


tag 


100883 


96ORF151


2 


19217. .19387 


56 


cctacacacccattgcgctactctt 


atg 


tga 


100884 


96ORF152


-1 


12614..12784


56


atttctacagtaaaaatatcttcat


ctg 


taa 


100885 


96ORF153


-2 


11836..12006


56 


ttgcattacctattgcgaatgctag


ttg 


taa 


100886 


96ORF154


-2 


4165. .4335 


56 
atc


tga 


100887 


96ORF155


-3 


40464 . .40634 


56 


aaaccaggactgaactgcttcccta 


atg 


tga 


100888 


96ORF156


3 
55


tggraattttgataatttagcttta 


ata 


taa 


100889


96ORF157


-1 


41879. .42046 


55 


gtagcaaaatttttattctaagtaa


ata 


taa 


100890 


96ORF158


-2 


36166 . .36333 


55 


catccatgttcgtgccgtttggtaa 


atc


tag 


100891


96ORF159


-2 


16228..16395


55 


tttaacatccgagcatacctcttat 


ttg 


taa 


100892 


96ORF160


3 


1038. .1202 


54 


atctctaagcagttgttaagcagcg


ttg 




100893


96ORF161


-1 


19193 . .19357 


54 


tctttgttgttaggtacaccaaaca 


atg 


tag 


100894


96ORF162


-1 


18074..18238


54 


cccgtcccactaacacaatagatcc


ata 


tga 


100895 


96ORF163


-1 


15386..15550


54 


agccatc&taggactgtaaaattca 


ctg 


taa 


100896 


96ORF164


-1 


10049. .10213 


54 


tacatcgatttcaataagcttttga


att 


tag


100897


96ORF165


-2 


18514..18678


54 


gcgcttcaatatcatctattaactt 


ata 


taa 


100898


96ORF166


-2 


11104. .11268 


54 


ccagccatgatcacccttaaattag


ttg 


tag


100899 


96ORF167


-3 


13764. .13928 


54 


agacagtttataatgtgtatctcta


ata 


tga 


100900 


96ORF168


1 


14305. .14466 


53


ttttgaatttttggaggacgagtaa 


_»=g 


tag 


100901 


96ORF169


-1 


17885. .18046 


53 


gtgttgaagccttaatagactcttt 


ata 


tga 


100902 


96ORF170 


-1 


10790. .10951 


53 


taggcgccttacatatccacgtcaa


att 


taa 


100903 


96ORF171


-3 


12765. .12926 


53


atcttcgtttaqtatataoaacgct 


ctg 


taa 


100904 


96ORF172


3 


22836..22994


52 


cgttcgcaacgcttaaaccaactga


ata 


tga 


100905 


96ORF173


-1 


15956. .16114 


52 


ctctacaccatcactagccgtcgtc


ata 


taa 


100906 


96ORF174


-1 


10571. .10729 


52 


tagtgccattcatattactttctaa 


ata 


taa 


100907 


96ORF175


-1 


3440. .3598 


52 


cagcctatcttcactaccaacatga 


ttg 


taa 


100908


96ORF176


-3 


37170..37328


52 


1 1 ta tct b aaaca t tgccg t aagca 


gtg 


taa 


100909 


96ORF177


-3


6693. .6851 


52 


ttcctaatctactaagtaacccgac


ata 


taa 


100910 


96ORF178


-3 


5655. .5813 


52


gacatcttgattagttttttcagtc


atc


tag 


100911 


96ORF179


1 


34564. .34719 


51 


gttacagctgaagtcgacaaaatag 


ttg 


tag 


100912 


96ORF180 


1 


42661. .42816 


51 


atataaattctaacactaaaatact 


atg 


tga 


100913 


96ORF181


-2 


37741 . .37896 


51 


ccgacgcaccgtcaaccgatgtttt


atc


taa 


100914 


96ORF182


-2 


25039..25194


51 


ttegxaatcttcccccccgtcacta 


att 


tga 


100915 


96ORF183


-2 


4534. .4689 


51 


tcagctttaatattttcagccatag


ttg 


tga 


100916 


96ORF184


1 


6721..6873


50


ggagctggagaattcacagtaaaag


ttg 


tag 


100917 


96ORF185


2 


36548. .36700 


50


acaaaaacatacgcgatatgaaaat


gtg 


taa 


100918 


96ORF186


-1 


40025. .40177 


50


tggagaccctgaataaacatcactt


ata 




100919 


96ORF187


-1 


34466. .34618 


50 


actacctttaacaaggtcagcgcca 


ttg 
-1


33842. .33994 


50 


agtccctctatctgattcatagaaa


ctg 


taa 


100921 


96ORF189


-1 


24914. .25066 


50 


acacagaatggtcttccgtgtgtga


atc


taa 


100922 


96ORF190 


-2 


20395. .20547 


50 


tatcttagagtaaccctctccactc


ata 


tga 


100923 


96ORF191


3 


24768. .24917 


49 


aaaggaattgaagcagtgaaacacg 


«g 


taa 


100924 


96ORF192


-1 


16169..16318


49 


ttgtggtttcggcaacgttgcctgt


atg


tga 


100925


96ORF193


-2 


39100. .39249 


49


cagtaccgtttttaccgggtgcgcc




taa 


100926 


96ORF194


-2 


25921. .26070 


49 


ttggtacagacgtccctgctaatcg 


ttg 


taa 


100927 


96ORF195


-2 


17779. .17928 


49 


caaccaatgctcgggatggtcaggg 


-"3 


tga 


100928 


96ORF196


-2 


14182. .14331 


49 


ttaaatacctttcttccagcaatgc 


atc


tga 


100929 


96ORF197


-2 


7609. .7758 


49 


ttatcaccaaacgacttaacaccaa 


ctg 


tga 


100930 


96ORF198
1537. .1666
ttattagctagtgcgttagtgttag


gtg 


taa 


100931 


96ORF199


-3 


7719..7868


49 


taatacttgratcggatagtcaect 


att 


taa 


100932 


96ORF200 


2 


22271. .22417 


48 


ttctttaatgaggttaaacctctaa 


ttg




100933 


96ORF201 


2 


30353. .30499 


48 


tctactattggcgaaaaaataaggc


ttg 


tag 
96ORF202


2 


32591. .32737 


48 


agattgaagcccaacggacaattta 


ttg 


taa 


100935 


96ORF203


2 


39131. .39277 


48 


agcaaagactttaaagagaaaatag


ata 


tag 


100936 


96ORF204 


-2 


36985..37131


48 
atc


tga 


100937 


96ORF205


-3 
48
ata


taa 


100938
48
att
100939


96ORF207 


-3 


11550..11696


48


tcgctctctcgctccatgattttgg


ata 


taa 


100940 


96ORF208


2 


37178 . . 37321 


47


agattagtaagacacccttatgtaa


gtg


taa 


100941


96ORF209


2 


42341. .42484 


47


tgcatatctaaaccacccacactag


ttg 


taa 


100942


96ORF210


3 


41850..41993


47 


aaaaotaataacgtaagggaccrgct 


atc


tag 


100943 


96ORF211


-1 


6662. .6805 


47 


ctgttggaatggtgggacgaattgg 


..«s 


tga 


100944


96ORF212


-2 


25213. .25356 


47


agtcgcacattcccaaaattgtaaa


atc


taa 
96ORF213


-3 


42219. .42362 


47 


gcggcttgatcatttacaacacaac


ata 


taa 


100946 


96ORF214


3 


27834 . .27974 


46 


aaaagattttagacttcgttagaac


atc


tag 


100947 


96ORF215


3 


35811. .35951 


46 


ctacgcaatagtctagacgcagacg


ata 


taa 


100948 


96ORF216


-1 


5402. .5542 


46 


tttccgtaaggtgtattcaacttga 


att 


tga 


100949 


96ORF217


-2 


24229. .24369 


46 
atc


taa 


100950 
-2


6253..6393


46 


ttgtcattcttgctaacacgtcaga 


ttg 


taa 


100951


96ORF219


1 


883..1020


45 


aaatcactcccgaaatattcgttaa 


ata 


taa 


100952 


96ORF220 


2 


32936. .33073 


45 


gataaaggtatagacaaagtattgt 


atc


taa 


100953 


96ORF221


3 


41703..41840


45 


ggtaagcccacaggtggtttggtag


ctg 


taa 


100954


96ORF222


-1 


39860..39997


45 


acttttattaggttcaactccattt


att 


taa 


100955 


96ORF223


-1 


24716. .24853 


45 


acatttcaaatgattctggaacaac


ata 


taa 


100956 


96ORF224


-2 


26794. .26931 


45 


caatatcacgccatgtagtttttaa 


ctg 




100957 


96ORF225


-2 


19201. .19338 


45 


caaacaatggattgtaatcaaataa 


atg 


tga 


100958


96ORF226


-2 


15709..15846


45 


tgacccgcttgttgcctaacacaat


ata 


taa 


100959 


96ORF227


-3 


36711..36848


45 


acattgactgccccgataattatct


ata 


tga 


100960 


96ORF228


3 


2325. .2459 


44 


tcgccacagtgagttccaataccgt 


ata 


taa 


100961 


96ORF229


-1 


38612..38746


44 


ttgtcattgacacctattcttatag 


atg


tga 


100962 


96ORF230


-1 


31733. .31867 


44 


gctggattgtatggcttaaagtaat 


ctg 


tag 


100963 


96ORF231


-2 


12076..12210


44 


tgacccatagctttaacctgtccgt


ctg 


taa 


100964 


96ORF232


-3 


31644. .31778 


44 


acagtcctcaagtgttaaccctagt 


ttg 


taa 


100965 


96ORF233


-3 


23988..24122


44 


atttgatttgtaagttcaggcccaa 


ctg 


taa 


100966 


96ORF234


-3 


17529. .17663 


44 


agtacgcttttttgaatcgtaccta 


atg 


taa 


100967 


96ORF235


1 


7153. .7284 


43 


aacgctaatggtccaacagaaatca


atg 


tag 


100968 


96ORF236


2 


2681. .2812 


43 


ttctttcacctcaactccacatttc 


ata 


tga 


100969 


96ORF237


2 


4496. .4627 


43 


gtactatgcttcacagtcttagcga


ttg 




100970 


96ORF238


-1 


41720. .41851 


43 


cacctgtaattcttgaattagttga 


ata 


tga 


100971 


96ORF239


-1 


35324. .35455 


43 


acttactaataaaatagaatagttt 


3tg 


taa 


100972 


96ORF240


-1 


8570. .8701 


43 


atccccgttttgacttaatacatca


atc


tga 


100973 


96ORF241


-2 


33502..33633


43 


at aat t z t gt aa t act c t c agggat 


atg 


tag 


100974 


96ORF242


-2 


23662. .23793 


43 


agccaatgctacagcagtgttgtaa 


atc


tag 


100975 


96ORF243


-3 


32391. .32522 


43 


acctggacgagcttgcgtcatataa


ata 


tag 


100976 


96ORF244


-3 


30273. .30404 


43 


aaaactttcgttatactcttggtaa


atc


tga 


100977 


96ORF245


-3 


5895..6026


43 


tgcaccaaaatgcttataattctta


arc 


taa 


100978 


96ORF246


-3 


2679. .2810 


43 


attcaccaagaaactatagccggtc


atg 


tga 


100979 


96ORF247


1 


34891..35019


42 


acaccaagcaaatctggtgtgttag 


ttg 


taa 


100980 


96ORF248


2 


30668. .30796 


42 


aattattacactaaagctggtgtga 


atg


tag 


100981 


96ORF249


2 


31838. .31966 


42 


caaatactagcttgtagtgagttag 


atg 


taa 


100982 


96ORF250


2 


33539. .33667 


42 


cttaccagaaacaacacaggtagaa


ata 


taa 


100983 


96ORF251


-1 


20486. .20614 


42 


cttctgtacgagccacacgcaatga 


ttg 


tag 


100984 


96ORF252


-1 


15128..15256


42 


gatatttcattactagctactacta 


ata 


tga 


100985 


96ORF253


-2 


41446. .41574 


42 


aaaacctaattcagataaacgataa 


ttg 


tga 


100986 


96ORF254


-2 


41005. .41133 


42 


gttataaccatgaccggctacaagc


ata 


taa 


100987 


96ORF255


-2


23008. .23136 


42 


aggataaatgacttgaccatctttc 


ata 


taa 


100988 


96ORF256




14794 . . 14922 


42 


ttgtatgcgtcaacgagtcggtcga


ttg 


tag 


100989


96ORF257


-2


8503..8631


42 


tacctaacttttttaataatttcta 


atg 


t3« 


100990 


96ORF258


-3 


22143. .22271 


42 


aaacgctttgtaaaatgcctctgca 


att 


tga 


100991 


96ORF259


-3 


18639. .18767 


42 


cttgtacctattatagagattaacc 


att 


tag 


100992 


96ORF260 


-3 


15624 . .15752 


42 


gtttcggtaactagccactgtatag 


ata 


taa 


100993 


96ORF261


2 


18746..18871


41 


catattgaggctctaatagagtcac


ata 


taa 


100994 


96ORF262


-1 


13067. .13192 


41


aattaattaattcttctcttgttgg


ttg 


taa 


100995 


96ORF263


-2 


18742. .18867 


41 


taacagacacgtctaaccgccttac 


att 


tga 


100996 


96ORF264


-2 


18376. .18501 


41 


cataccatcataaagaacaagcaac 


ttg 


taa 


100997 


96ORF265


-2


367. .492 


41


ctaaacgaaaaagagggtacaatac


atc




100998


96ORF266


-3 


32802. .32927 


41 


aggtacatccatttqatacaatact 


"9 


taa 


100999 


96ORF267


-3 


10194. .10319 


41 


atcatcgaaaggcgataacccgtta


ttg 


tga 




1 


1159. .1281 


40 


tzattcttcctttttgtaactgtaa 


atg 


taa 




2 


10373. .10495 


40 


gacagagctgaaaagaaaatcatga 


atg 


taa 




2 


15734. .15856 


40 


ttattcggcgtactcgcactgatgc


ttg 


tag 


101003 


96ORF271


-1 


43451. .43573 


40 


cct No Shine-Dalgarno sequence


act 


tga


101004 


96ORF272


-1 


36959..37081


40 


acgctacaaaaataacttttattag 


at or 


_t-3_ 


101005 


96ORF273


-1 


35798. .35920 


40 


ctgacgcactccgttggtttgatgc 


att 


taa 


101006 


96ORF274


-1 


8147..8269


40 


tctgtctctctatgtttgttagtct


ctg 


tga 


101007 


96ORF275


-2 


43066..43188


40 


tttaacttactaattttctcctgat


ata 




101008


96ORF276


-2 


42535..42657


40 


aaataatgcaaattgttttcatagt 


act 




101009 


96ORF277


-2 


30628. .30750 


40 


ttcgtagtcccgcctctgcaaaagt 
96ORF278


-2 


13291 . . 13413 


40 


tccgtatctcccaagcaattcattc


. . ,"9 


tga 


101011 


96ORF279


-2 


3172..3294


40 


cagactgtttagtaacgcctaattt


atc


taa 


101012


96ORF280


-3 


18804 . .18926 


40 


caaacaaccaacacgcgtatcaaca


acc 


tag 


101013 


96ORF281


-3 


15843 . .15965 


40 


attcaaaaagtgtattccataacca


acc 


tag




96ORF282


-3 


8460. .8582 


40


ttagtcatcactcaactctttctcc


att 


taa 


101015 


96ORF283


-3 


7593. .7715 


40 


gatgttgtctacacaqtgctaacac 


atg 


taa 


101016 


96ORF284


-3 


6453..6575


40 


aattaatttttaattaccatttcta 


att 


tga 




96ORF285


1 


15082 . .15201 


39 


caatacttagtcacaacattcaaag 


act 


taa 


101018 


96ORF286




34444 . .34563 


39 




atg 




101019 


96ORF287


2 


27920..28039


39 


cctattttagcagttqrcgcaqtaa 


"9 


tag 


101020 


96ORF288


2 


28415. . 28534 


39 


a t egget t 1 1 z aactggcqt aatga 


atc


tag


101021 


96ORF289


2 


38147. .38266 


39 


tatcaaatgcttaatttaggcaagt


atc


tga 


101022 


96ORF290


3 


40917. .41036 


39 


gcaaacttaaacacttccacatcat


atg 


taa 


101023 


96ORF291


-2 


38815. .38934 


39 


z ct ct aaaaacagct t acaqegaa c 


ata 


taa


101024 


96ORF292


-2 


32671. .32790 


39 


ccacaggattacaaatcgctgacgc 


ata 


tga 


101025 


96ORF293


-2 


31216.. 31335 


39 


ccgatccgatgtctcttatacttga


ttg 


taa 


101026 


96ORF294


-2


21589.. 21708 


39 


gtatccccatcagaatcgcctaaaa


atc


taa 


101027 


96ORF295


-2 


18976..19095


39 


tatcaacacatgctaacctagcacc 


ata 


taa 


101028 


96ORF296


-2 


11482..11601


39 




att 


taa 


101029 


96ORF297


-3 


12933. .13052 


39 


tcacgaaataatgtttctccaatcc


ata 


taa 


101030 


96ORF298


-3 


8262..8381


39 


gaactgatcttgcttaaatgattta


att 


tag 


101031 


96ORF299


-3 


6993. .7112 


39 


cattagcattagcgaacgggtttga


ttg 


tga 


101032 


96ORF300


2 


23516. .23632 


38 


occacacccgaacaaccaaaacttc 


atc


tag 


101033 


96ORF301


2 


25943. .26059 


38 


agattagaagaagaaaaaagaagac 


gcg 


taa 


101034 


96ORF302


2 


36929. .37045 


38 


tattggggttctgtaacacggggca


atg 


tag 


101035 


96ORF303


3 


4476. .4592 


38


a c o a aagct acc t agt agcagt ac t 


atg 


tga 


101036 


96ORF304 


3 


20586. .20702 


38 


cactetaagatagccaaagcaatac 


_9tg 


tga 


101037 


96ORF305


3 


28356..28472


38 


cggttaccaatgtgcttgatacgat 


ttg 


taa 


101038 


96ORF306


-1 


24359. .24475 


38 


acttaaataaaagccgtatcgtgcc


atg 


taa 


101039 


96ORF307


-1 


20147. .20263 


38 


ttgtacctatacgagttaactcctc


att 


tag 


101040 


96ORF308


-2


38158.. 38274 


38 


ccccgtatccactttctaagaaagc 


gtg 


tga 


101041 


96ORF309


-2 


35149. .35265 


38


agctcgtttgtatcgtctttaacga 


aca 


taa 


101042 


96ORF310


-2 


31423. .31539 


38 


gtaatatgattaggtcccctcttat 


ttg 


taa 


101043 


96ORF311


-2 


10438. .10554 


38 


cgcctttaaatcgttttaggtcact


atc


taa 


101044 


96ORF312


-2 


1390.. 1506 


38 


gagaacaacacaaacattaacaaca


atc


taa 


101045 


96ORF313


-3 


33051. .33167 


38 


acgtcctgtttctagatcgtaatac 


ata 


tag 


101046 


96ORF314


-3 


25194.. 25310 


38 


agcaaaccgtcaaagataacattga


atc


taa 


101047 


96ORF315


-3


6273. .6389 


38 


cattcttgctaacacgtcagattga 


ctg 


tga 


101048 


96ORF316


-3 


4281. .4397 


38 


ataattcgtattcattaatcattaa


att 


tag 


101049 


96ORF317


1 


2260. .2373 


37 


acgactccttttctcatatttcttt 


ata 


taa 


101050 


96ORF318


2 


21230. .21343 


37 


atttcacacttttttagttgtctcc


ttg 


taa 


101051 


96ORF319


3 


18018. .18131 


37 




acg 


tag 


101052 


96ORF320 


3 


36972. .37085 


37 


attacagacatcctaagggttcccg


att 


taa 


101053 


96ORF321


-1 


36302. .36415 


37 


ctcttgagctttttgacctaattto 


acc 


taa 


101054 


96ORF322


-1 


32606.. 32719 


37 


ccaeaagttatttctccagttcrat 


att 


taa


101055 


96ORF323


-1 


11453 ..11566 


37 


ctaaaccgttcctttttatcaactc 


att 


tga 


101056 


96ORF324


-1 


7268. .7381 


37 




ata 


tga 


101057 


96ORF325


-2 


32347. .32460 


37 


ttactgcatttgtatatggcgataa


acc 


tag 


101058 


96ORF326


-2 


24682. .24795 


37 


acgtttattacgctcataaagccat 


ata 


tag 


101059 


96ORF327


-2 










taa 


101060 


96ORF328


-2 


21460. .21573 


37 


agagcactaatacgcttttgttctt


ctg 


tga 


101061 


96ORF329


-2 


21208. .21321 


37 


gacttaacttcttcgatattcatat


acc 


z 3 a 


101062 


96ORF330


-2 


18085. .18198 


37 






ta S 


101063 


96ORF331


-2 


8170..8283


37 


acttcgagacgtcgtctgtctcccc




"2 


101064 


96ORF332


-2 


5971. .6084 


37 


caatccgtttcccgttttctcctag




tag 


101065 


96ORF333


-3 


37632. .37745 


37 


accttgcttaatcaagtcgtaacca


act 


tga 


101066 


96ORF334


-3 


29628. .29741 


37 


ctgagctagtgccgtaaaatgtcat 






101067 


96ORF335


-3 


7164. .7277 


37 


ctagcggatotccgttttctagtaa 


acc 


taa 


101068 


96ORF336


1 


22903. .23013 


36 


gtaaaaaaagacaatatgactatta






101069 


96ORF337


1 


43258..43368


36 


caattgacgtggtcactcttcaggt 




taa 


101070 


96ORF338


2 


12668..12778


36 


gaactggtggaatgggcatggaaca 


atc


tag 


101071 


96ORF339


2 


28292. .28402 


36 


ttcactgctttaattcagttgccta




taa 


101072 


96ORF340


2 


35396. .35506 


36 


ttcctaatgaacacaagtcaacggc


att 


tga 


101073 


96ORF341


3 


25428..25538


36 


actcgagaacaattagaaaaagcaa


"9 




101074 


96ORF342


-1 


40913 . .41023 


36 




ata 




101075 


96ORF343


-1 


39173. .39283 


36 


tgccacatcctagtgccaggattga 


ttg


.tfs 


101076 


96ORF344


-1 


37580. .37690 


36 




ata 


taa 


101077 


96ORF345


-1 


31556. .31666 


36 


ggattattcttcctaataactccaa 


teg 


tga 


101078 


96ORF346


-1 


29972. .30082 


36 


ggctactccctatctaaaatataat


ttg 


taa 


101079 


96ORF347


-1 


28787. .28897 


36 






tga 


101080 


96ORF348


-1 


21839..21949


36 


ttaaaatccgataaaataacattgc 


ctg 


_gga 


101081 


96ORF349


-1 


3647. .3757 


36 


caaaacctccgaagctacccagcgt


teg 
96ORF350


-2


40801. .40911 


36 


accaccccaatttcgcccatacgat


3*9 


tag 


101083


96ORF351


-2


38953.. 39063 


36 


tatcttttaaaattctcataataac 


atc


taa 


101084 


96ORF352


-2 


31585. .31695 


36 


tagctgtcatcactagtatttttga 


atc


taa 


101085


96ORF353


-2 


24550. .24660 


36 


atagtccgttttaccgcctcqtact 


att 


fa 


101086 


96ORF354


-2


20083.. 20193 


36 


atcatcattttgatatttctcaaac


ata 


tga 


101087


96ORF355


-2 


991. .1101 


36 


gcatcttggcagtacgacgtaaaoc 


atc




101088 


96ORF356


-3 


38148. .38258 


36 




att 


tga 


101089 


96ORF357


-3 


8790. .8900 


36 


tgaagtcatccagcgccacttttct


ttg 


tag 


101090 


96ORF358


-3 


4458..4568


36 


ttcataaaagtattctttgtagtat 


atg 


tag 


101091 


96ORF359


1 


4666. .4773 


35 


ttatcaaaatatacaacttaattaa 


atc


tag 


101092 


96ORF360 


1 


11569. .11676 


35 


ataaatttaccgaacatgaaaatga 


att 


tga 


101093 


96ORF361


2 


6122. .6229 


35 


ggaaaacaaattgatgttgtaqtga 


ttg 


taa 


101094 


96ORF362


-1 


40418. .40525 


35 


t t egtagg tgr cat t act t ct c t aa 


tcg


tag 


101095 


96ORF363


-1 


34358. .34465 


35 


gttttgcttgatttcgatttgttga 


atg


tga 


101096 


96ORF364


-1


20654..20761


35 


ctatttccactgattccccatctaa


atg 


tga 


101097 


96ORF365


-1


8423. .8530 


35 


tettttttagagttacgaggtttca 


att 


tag 


101098 


96ORF366


-1 


2402. .2509 


35 


tgacgtatggcaacattttagatca


atc


taa 


101099 


96ORF367


-2


36607..36714


35


aaaataaaaagccagtgccgaagca 


ctg 


tag 


101100 


96ORF368


-2 


27061 . .27168 


35 


caaatcgtcctgcagegctcaataa 


atc


tag 


101101 


96ORF369


-2 


26470. .26577 


35 


atgagttgttaaqtttaccccaaat 


atc


taa 


101102 


96ORF370


-2


10327..10434


35 


ccgtgccatcttctcggtataaqta 


ata 


taa 


101103 


96ORF371


-2 


8650. .8757 


35 


gggtaegggttgttactgttqatat 


atc


taa 


101104 


96ORF372


-3


14382..14489


35 


gttcttttaattqatctactqttaa 


att 


taa 


101105 


96ORF373


-3 


8151. .8258 


35 


atgtttgttagtctctgtgtagtct 


atg 


taa 


101106 


96ORF374


-3 


5007. .5114 


35 


aaacgactcaaqtggaacattattc


ata 


taa 


101107 


96ORF375


2 


30563. .30667 


34 


cgattagaaatctttaaaaaaggac


ttg 


tga 


101108 


96ORF376


-1 


19916. .20020 


34 


cctatgccaggtaatttgtcattaa 


att 


taa 


101109 


96ORF377


-1 


9236. .9340 


34 


cttttctgttagtaattgtttttaa 


atc


taa 


101110 


96ORF378


-1 


9026. .9130 


34 


actctttatctttagttgcttttaa 


ata 


tag 


101111 


96ORF379


-2 


28447. .28551 


34 


cttttgtgataataaagtttagtgc 


ttg 


tga 


101112 


96ORF380


-3 


40329. .40433 


34 


ccatttaccttcttgagatgttgga 


ttg 


cga 


101113 


96ORF381


-3 


39801. .39905 


34 


caaaaqatgaaggctttttecatac 


ttg 


taa 




96ORF382


-3 


33831.. 33935 


34 


atgttgtttgtaacccgattaagtt


atc


tga 


101115 


96ORF383


-3


33687..33791


34 


gttattacgtcttaatacttgtgtt 


TO 


tag 


101116 


96ORF384


-3


13530..13634


34 


tatacgcactagtactgateaetga 


ttg


taa 


101117 


96ORF385


-3 


3843. .3947 


34 


tttgattgattgttctagttaagaa 


att 


taa 


101118 


96ORF386


1 


12256. .12357 


33 


agtcataaagaagttagcaatgtga 


ttg 


tag


101119 


96ORF387


2


2207..2308


33 


tccaagactctttaactgttaactt 


atc


tag


101120 


96ORF388


2 


2519. .2620 


33 


attgttgaatttcgattgatctaaa 


* c 9 


tga 


101121 


96ORF389


2 


22517. .22616 


33 


agaagtaaaatgcgtaatgcttcag




tag 


101122 


96ORF390


2


27302. .27403 


33 


ttccaaaattgggctaatagcgtag


ctg 


taa 


101123 


96ORF391


2 


32384. .32485 


33 


actaaaaaggtcgagaaagccgtag


*H3.. 


taa 


101124 


96ORF392


2 


39287. .39388 


33 


aaaaacggtactgtagtatcaatca 


atc 


taa


101125 


96ORF393


3 


18153. .18254 


33 


gtagtatatgccqactttqatttga 


at ? 


taa 


101126 


96ORF394


3 


24189. .24290 


33 


tcagaccctaacattaacaaaccag 


"S 


tga 


101127 


96ORF395


-1 


15266. .15367 


33 


tcgataatttgtatagcttgtttta 


atg 


tag 


101128 


96ORF396


-2


32239..32340


33 


ttttagtgaaagcatctagtgttga 


ata 




101129 


96ORF397


-2 


16123. .16224 


33 


ttatgtgtgcctatcatattaacaa 




tag 


101130 


96ORF398


-2 


13648. .13749 


33


tetttaactgaatqttqaataqcac


"a 


tag


101131 


96ORF399


-2


10987..11088


33 


acttccgtaggtattcttatatcaa 


ttg 


^ 


101132 


96ORF400


-2


3382..3483


33 


cttactggtaattcttcaaaattaa 


-tg 


taa 


101133 


96ORF401


-3 


40794. .40895 


33


ccatatgatgtqaaaqtgtttaaat 


ttg 


taa 


101134 


96ORF402 


-3 


39978..40079


33 


acattcctaaatcacttgaacctaa 


att 


tga 


101135 


96ORF403


-3 


38607. .38706 


33 


accttcagtgtaaaatcgacagcca 


acq 


ta ? 


101136 


96ORF404 


-3 


21288. .21369 


33 


cagacaccgtcttaaqtcccttcaq 


ata 


taa


Table 11



SEQUENCE INFORMATION FOR PHAGES MATCHING WITH TABLE 1 
M32695 

Bacteriophage PM2 nuclease cleavage site 
gi|166145|gb|M32695|BM2NCS [166145]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32693 

Bacteriophage PM2 Hind III fragment 4 
gi|166144|gb|M32693|BM24HIND3 [166144]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32693 

Bacteriophage PM2 Hind III fragment 4
gi|166144|gb|M32693|BM24HIND3 [166144]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M32694 

Bacteriophage PM2 Hind III fragment 3 
gi|166143|gb|M32694|BM23HIND3 [166143]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M26134 

Bacteriophage PM2 structural protein gene containing purine/pyrimidine rich
regions and anti-Z-DNA-IgG binding regions, complete cds
gi|289360|gb|M26134|BM2PROTIV [289360]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
J02452 

bacteriophage f1 3'-terminal region rna
gi|215409|gb|J02452|PFTrR3 [215409]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
AF020798 

Bacteriophage Chp1 genome DNA, complete sequence
gi|217761|dbj|D00624|BCP1 [217761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 12 protein links, or 1 genome link)
X72793 

Clostridium botulinum C phage BONT/C1, ANTP-139, ANTP-33, ANTP-17, ANTP-70
genes and ORF-22

gi|516171|emb|X72793|CBCBONT [316171]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 4 nucleotide neighbors)
X51464 

Clostridium botulinum D Phage C3 gene for exoenzyme C3 
gi|14907|emb|X51464|CBDPE3 [14907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
D90210

Bacteriophage c-st (from C. botulinum) C1-tox gene for botulinum C1 neurotoxin
gi|217780|dbj|D90210|CSTC1TOX [217780]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
S49407

^^SS^SS^S: l€ PU ' h0$t " C bcta{ **m. type D, CB16. Gennrnic. 4087 at] 
gi|260238|gb|S49407|S49407 [260238]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X53370 

Bacteriophage phi29 temperature sensitive mutant TS2(98) DNA polymerase gene
gi|15733|emb|X53370|POTS298 [15733]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X53371

Bacteriophage phi29 temperature sensitive mutant TS2(24) DNA polymerase gene
gi|15731|emb|X53371|POTS224 [15731]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 7 nucleotide neighbors)
X05973 

Bacteriophage phi29 prohead RNA
gi|15680|emb|X05973|POP29PRO [15680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
V01155

Left end of bacteriophage phi-29 coding for 15 potential proteins. Among
these are the terminal protein and the proteins encoded by the genes 1, 2 (sus), 3, and (probably) 4
gi|15659|emb|V01155|POP29B [15659]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 16 protein links, or 16 nucleotide neighbors)
X73097 

Bacteriophage phi-29 left origin of replication 
gi|312194|emb|X73097|BP29ORIL [312194]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M14430 

Bacteriophage phi-29 gene-17 gene, complete cds

gi|215321|gb|M14430|P29O17A [215321]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 6 protein links, or 6 nucleotide neighbors)
M14431 

Bacteriophage phi-29 gene-16 gene, complete cds
gi|215319|gb|M14431|P29016A [215319]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 7 nucleotide neighbors)

M20693 

Bacteriophage phi-29 DNA, 3' end
gi|215343|gb|M20693|P29REPINB [215343]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M21016 

Bacteriophage phi-29 DNA, 5' end

gi|215342|gb|M21016|P29REPINA [215342]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)

M12456

Bacteriophage phi-29 genes 9, 10 and 11 encoding p9 tail, incomplete, p10
connector, complete, and p11 lower collar, incomplete, respectively
gi|215338|gb|M12456|P29P9 [215338]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)

M14782

Bacillus phage phi-29 head morphogenesis, major head protein, head fiber
protein, tail protein, upper collar protein, lower collar protein, pre-neck
appendage protein, morphogenesis(13), lysis, morphogenesis(15), encapsidation genes, complete cds
gi|215323|gb|M14782|P29LATE2 [215323]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 11 nucleotide neighbors)

M26968 

Bacteriophage phi-29 (from Bacillus subtilis) proteins p1 delta-1 genes, complete cds, and the sus1(629) mutation

gi|341558|gb|M26968|P29P1D1A [341558]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)

J02448

Bacteriophage f1, complete genome

gi|166201|gb|J02448|F1CCG [166201]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 205 nucleotide neighbors, or 1 genome link)

M24832 

Bacteriophage f2 coat protein gene, partial cds

gi|166228|gb|M24832|F2CRNACA [166228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)

J02451 

Bacteriophage fd, strain 478, complete genome 
gi|215394|gb|J02451|PFDCG [215394]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 10 protein links, 204 nucleotide neighbors, or 1 genome link)

M34834
Bacteriophage fr replicase gene, 5' end
gi|166139|gb|M34834|BFRREGRA [166139]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)

M38325

Bacteriophage fr replicase gene, 5' end
gi|166137|gb|M38325|BFRREGR [166137]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)
M35063 

Bacteriophage fr coat protein replicase cistron (R region) RNA
gi|166134|gb|M35063|BFRRCRRA [166134]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)

S66567

alpha-atrial natriuretic factor/coat protein fusion polypeptide [human,
bacteriophage fr, expression vector pFAN15, PlasmidSyntheticRecombinant, 510 nt]
gi|435742|gb|S66567|S66567 [435742]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 15 nucleotide neighbors)
X15031

Bacteriophage fr RNA genome

gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)

U51233

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable
region light chain (IgM) mRNA, partial cds
gi|1277150|gb|U51233|MMU51233 [1277150]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1669 nucleotide neighbors)
U51232 

Mus musculus neutralizing anti-RNA-bacteriophage fr immunoglobulin variable region heavy chain (IgM) mRNA, partial cds
gi|1277148|gb|U51232|MMU51232 [1277148]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1073 nucleotide neighbors)

U02303 

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)

V00604 
Phage M13 genome

gi|14939|emb|V00604|INM13X [14959]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 205 nucleotide neighbors)

A32252 

Synthetic bacteriophage M13 protein III probe
gi|1567340|emb|A32252|A32252 [1567340]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
A32251 

Synthetic bacteriophage M13 protein III probe 
gi|1567339|emb|A32251|A32251 [1567339]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M12465

Bacteriophage M13 mp10 mutations in lac operon
gi|215210|gb|M12465|M13LACMUT [215210]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 215 nucleotide neighbors)
M24177 

Synthetic Bacteriophage M13 (clone M13.SV.B12) SV40 early promoter region DNA 
gi|209416|gb|M24177|SYNSVB12 [209416]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24176 

Synthetic Bacteriophage M13 (clone M13.SV.B11) SV40 early promoter region DNA
gi|209415|gb|M24176|SYNSVB11 [209415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M24175

Synthetic Bacteriophage M13 (clone M13.SV.8) SV40 early promoter region DNA
gi|208806|gb|M24175|SYNM13SV8 [208806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 242 nucleotide neighbors)
M19979 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207813|gb|M19979|SYN33M13M [207813]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 617 nucleotide neighbors)
M19565 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33

gi|207808|gb|M19565|SYN33M13H [207808]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 567 nucleotide neighbors)
M19564

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207807|gb|M19564|SYN33M13G [207807]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 12 nucleotide neighbors)

M19563

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207806|gb|M19563|SYN33M13F [207806]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 262 nucleotide neighbors)
M19561 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207804|gb|M19561|SYN33M13D [207804]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 27 nucleotide neighbors)
M19560

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33
gi|207803|gb|M19560|SYN33M13C [207803]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M19559 

Synthetic hybrids; recombinant DNA from bacteriophage M13 and plasmid pHV33 
gi|207802|gb|M19559|SYN33M13B [207802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 227 nucleotide neighbors)
M10568

Bacteriophage M13 replicative form II, replication origin, specific nick location
gi|215220|gb|M10568|M13ORIB [215220]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 650 nucleotide neighbors)
M10910 

Bacteriophage M13 gene II regulatory region and M13sjl mutant
gi|215209|gb|M10910|M13IIREG [215209]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 72 nucleotide neighbors)

M38295 

Bacteriophage M13 HaeIII restriction fragment DNA
gi|215208|gb|M38295|M13HAEIII [215208]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 67 nucleotide neighbors)

E02067 

DNA encoding a part of Bacteriophage M13 tg 127
gi|2170311|dbj|E02067|E02067 [2170311]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
J02467 

Bacteriophage MS2, complete genome

gi|215232|gb|J02467|MS2CG [215232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 8 MEDLINE links, 4 protein links, 20 nucleotide neighbors, or 1 genome link)

AJ004950 

Bacteriophage P1 ban gene
gi|3688226|emb|AJ011592|BP1011592 [3688226]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)

U88974 

Bacteriophage P1 structural lytic transglycosylase (orf47), pep44b (orf44b),
pep44a (orf44a), and pep43 (orf43) genes, complete cds; and pep42 (orf42) gene, partial cds
gi|2661099|gb|AF035607|AF035607 [2661099]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein links, or 1 nucleotide neighbor)

AJ000741
Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, or 31 nucleotide neighbors)
X01828

Bacteriophage P1 recombinase gene cin
gi|15133|emb|X01828|MYP1CIN [15133]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X98146 

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)

S61175

immI operon: icd=cell division repressor, ant1=antirepressor (promoters
P51a, P51b) [bacteriophage P1, Genomic, 728 nt]
gi|385908|gb|S61175|S61175 [385908]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)

X87824 
Bacteriophage P1 gene 26
gi|861164|emb|X87824|XXBP1G26 [861164]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)

X15638

Phage P1 DNA for lytic replicon containing promoter P53 and two open reading frames
gi|15735|emb|X15638|PP1LREP [15735]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 24 nucleotide neighbors)
X17512

Bacteriophage P1 DNA for immunity region immI
gi|15479|emb|X17512|P1IMMUNIY [15479]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 4 nucleotide neighbors)
X16005

Bacteriophage P1 c1 gene for P1 c1 repressor protein
gi|15477|emb|X16005|P1C1 [15477]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X03453 

Bacteriophage P1 cre gene for recombinase protein

gi|15135|emb|X03453|MYP1CRE [15135]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
X06561 

Bacteriophage P1 c1 gene 5'-region
gi|15128|emb|X06561|MYP1C1 [15128]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 6 nucleotide neighbors)

V01534

Bacteriophage P1 genome fragment (IS2 insertion spot). This region contains

f ™ ™ duig fnma ud » known " iMcrtion hot spot for IS2 insertion sequences 

gi|15118|emb|V01534|MYOVP1 [15118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)
X56951

Bacteriophage P1 gene 10
gi|406728|emb|X56951|BPP1GP10 [406728]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
K02380

Bacteriophage P1 replication region including repA, parA, and parB genes and
incA, incB, and incC incompatibility determinants
gi|215652|gb|K02380|PP1REP [215652]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 MEDLINE links, 4 protein links, or 8 nucleotide neighbors)
X87674 

Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

X87673 
Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M16618

Bacteriophage P1 c1 repressor binding sites
gi|215600|gb|M16618|PP1C1 [215600]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
SEG_PP1CIN

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)

K03173

B USS?X P1 C inverlibIc elBB ««. n*i end, and cixR recombination site 
gi|215606|gb|K03173|PP1CIN2 [215606]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
215605 

« Sttffiim gJSSff - ** * — - d 5 ' - ^ invenible element 

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25470 

Bacteriophage P1 tail fiber protein gene, complete cds
gii341349|gb|M254701PP|7TPR (341349] 

M34382 

Bacteriophage P1 sim region proteins, complete cds
gi|215661|gb|M34382|PP1SIM [215661]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M81956 

Bacteriophage P1 R protein (R) gene, complete cds

gi|215658|gb|M81956|PP1RP [215658]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 4 nucleotide neighbors)
M37080 

Bacteriophage P1 mini-P1 plasmid origin of replication
gi|215657|gb|M37080|PP1REPOR [215657]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 46 nucleotide neighbors)

M27041 

Bacteriophage P1 ref gene, complete cds
gi|215650|gb|M27041|PP1REF [215650]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
L01408

Bacteriophage P1 partition protein (parB) gene, 3' end
gi|215642|gb|L01408|PP1PARB [215642]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 41 nucleotide neighbors)

SEG_PP1PAR 
Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215639|gb||SEG_PP1PAR [215639]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 48 nucleotide neighbors)
M36425 

Bacteriophage miniplasmid P1 parB gene, 3' end
gi|215638|gb|M36425|PP1PAR2 [215638]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M36424
Bacteriophage miniplasmid P1 parA gene, 5' end
gi|215637|gb|M36424|PP1PAR1 [215637]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11129

Bacteriophage P1 miniplasmid origin of replication region
gi|215632|gb|M11129|PP1ORIM [215632]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 43 nucleotide neighbors)
M25414 

Bacteriophage P1 c1 repressor binding site, operator 88 (Op88)
gi|215631|gb|M25414|PP1OP88A [215631]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M25413

Bacteriophage P1 c1 repressor binding site, operator 68 (Op68)
gi|215630|gb|M25413|PP1OP68A [215630]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25412

Bacteriophage P1 c1 repressor binding site, operator 21 (Op21)
gi|215629|gb|M25412|PP1OP21A [215629]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10510 

Bacteriophage P1 recombination site loxR
gi|215628|gb|M10510|PP1LOXR [215628]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M10494

Bacteriophage P1 recombination site loxP
gi|215626|gb|M10494|PP1LOXP [215626]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 134 nucleotide neighbors)
M10511 

Bacteriophage P1 recombination site loxL
gi|215625|gb|M10511|PP1LOXL [215625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M10512 

Bacteriophage P1 recombination site loxB
gi|215624|gb|M10512|PP1LOXB [215624]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M10145 

Bacteriophage P1 genome fragment with recombination site loxP
gi|215623|gb|M10145|PP1CREX [215623]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 21 nucleotide neighbors)



WO 00/32825 



PCT/IB99/02040 



M13327

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI326
gi|215622|gb|M13327|PP1CN26IV [215622]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13325

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI326
gi|215621|gb|M13325|PP1CN26II [215621]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13323 

Bacteriophage P1 Cin recombinase activated cross over site, junction IV, clone pSHI325
gi|215620|gb|M13323|PP1CN25IV [215620]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13321

Bacteriophage P1 Cin recombinase activated cross over site, junction II, clone pSHI325
gi|215619|gb|M13321|PP1CN25II [215619]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)

M13324

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI326

gi|215618|gb|M13324|PP1CN26I [215618]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13319 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13320

Bacteriophage P1 Cin recombinase activated cross over site, junction I, clone pSHI325
gi|215616|gb|M13320|PP1CIN25I [215616]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)

M13318

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI324
gi|215615|gb|M13318|PP1CIN24L [215615]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1370 nucleotide neighbors)
Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI323
gi|215614|gb|M13317|PP1CIN23R [215614]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1055 nucleotide neighbors)



M13316

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI323
gi|215613|gb|M13316|PP1CIN23L [215613]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)



M13315 



Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI322
gi|215612|gb|M13315|PP1CIN22R [215612]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13314

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI322
gi|215611|gb|M13314|PP1CIN22L [215611]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1401 nucleotide neighbors)
M13313

Bacteriophage P1 Cin recombinase activated cross over site, right junction, clone pSHI321
gi|215610|gb|M13313|PP1CIN21R [215610]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M13312

Bacteriophage P1 Cin recombinase activated cross over site, left junction, clone pSHI321
gi|215609|gb|M13312|PP1CIN21L [215609]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1058 nucleotide neighbors)
M16568 

Bacteriophage P1 c4 repressor gene, complete cds
gi|215603|gb|M16568|PP1C4 [215603]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M13326 

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI326
gi|215602|gb|M13326|PP1C26III [215602]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1192 nucleotide neighbors)
M13322

Bacteriophage P1 Cin recombinase activated cross over site, junction III, clone pSHI325
gi|215601|gb|M13322|PP1C25III [215601]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1231 nucleotide neighbors)

J05651 

Bacteriophage P1 modulator protein (bof) gene, complete cds
gi|215598|gb|J05651|PP1BOFY1 [215598]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M33224 

Bacteriophage P1 regulatory protein (bof) gene, complete cds
gi|215596|gb|M33224|PP1BOFFO [215596]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M10288

E.coli/bacteriophage P1 loxR recombination site
gi|146647|gb|M10288|ECOLOXR [146647]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10289

E.coli/bacteriophage P1 loxL recombination site
gi|146646|gb|M10289|ECOLOXL [146646]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10290

E.coli loxB site, which can recombine with bacteriophage P1 loxP site
gi|146645|gb|M10290|ECOLOXB [146645]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10287

Bacteriophage P1 loxP X loxP recombination site
gi|215627|gb|M10287|PP1LOXPX [215627]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
M74046 

Bacteriophage P1 pacA and pacB genes, complete cds
gi|215634|gb|M74046|PP1PACAB [215634]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M95666 

Bacteriophage P1 gene 10, doc and phd genes, complete cds

gi|463276|gb|M95666|PP1PHDDOC [463276]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 1 nucleotide neighbor)
M25604 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)

V00643 

first half of the phage Q-beta gene for coat protein 
gi|15088|emb|V00643|LEQBET [15088]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M25167

Bacteriophage Q-beta RNA fragment recovered from replicase binding complex
gi|556362|gb|M25167|PQBREPLICB [556362]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24876 

Bacteriophage Q-beta replicase RNA, 5* end 
30 gtlJ56360|gb|M24876|PQBREPLICA [556360] 

(VWW GtnB * nk «Pon,FASTA rcpooASN. 1 repoaGraphical view.l MEDLINE link, 1 protein link, or 4 nucleotide neighbors ) 
M25444

Synthetic bacteriophage Q-beta DNA

gi|209118|gb|M25444|SYNPQBTERM [209118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M25463

Bacteriophage Q-beta self-replicating microvariant (+) RNA
gi|532489|gb|M25463|PQBMVSRRNA [532489]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M25014

Bacteriophage Q-beta RNA replicase gene, 5' end, and maturation protein gene, 3' end
gi|294316|gb|M25014|PQBREPLC [294316]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

M25065

Bacteriophage Q-beta RNA sequence with putative stem loop
gi|294315|gb|M25065|PQBLOOP [294315]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
M10265

Bacteriophage Q-beta RNA molecule with the ability to replicate extracellularly
gi|215726|gb|M10265|PQBRNA [215726]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M24815

Bacteriophage Q-beta specified replicase subunit RNA
gi|215725|gb|M24815|PQBREPL [215725]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M25461

Bacteriophage Q-beta plus-strand RNA, 5' terminus

gi|215724|gb|M25461|PQBPS5E [215724]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M25462

Bacteriophage Q-beta plus-strand RNA, 3' terminus
gi|215723|gb|M25462|PQBPS3E [215723]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)
M24871

Bacteriophage Q-beta nanovariant WSIII RNA
gi|215722|gb|M24871|PQBNVWSIC [215722]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24870

Bacteriophage Q-beta nanovariant WSII RNA
gi|215721|gb|M24870|PQBNVWSIB [215721]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M24869

Bacteriophage Q-beta nanovariant WSI RNA
gi|215720|gb|M24869|PQBNVWSIA [215720]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M10495

Coliphage Q-beta MDV-1(+) RNA
gi|215719|gb|M10495|PQBMDV1A [215719]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
J02484

bacteriophage qbeta coat protein cistron first half
gi|215717|gb|J02484|PQBCP5 [215717]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
M37754

Bacteriophage Q-beta minus strand RNA, 5' terminus

gi|215716|gb|M57754|PQBBMS5E [215716]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 8 nucleotide neighbors)

M24297

Bacteriophage Q-beta 5'-terminal region of the minus strand
gi|215715|gb|M24297|PQB5END [215715]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 8 nucleotide neighbors)
M10695
Bacteriophage Q-beta, MDV-1 RNA
gi|215714|gb|M10695|PQBiIR [215714]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, MEDLINE links, or 12 nucleotide neighbors)
M24827

Bacteriophage R17 A protein gene, 3' end
gi|216078|gb|M24827|R17RNACIS [216078]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
M24829

Bacteriophage R17 coat protein gene, 5' end
gi|216075|gb|M24829|R17CP5 [216075]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)

J02488

bacteriophage r17 rna synthetase initiation site
gi|216080|gb|J02488|R17RNASYN [216080]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 6 nucleotide neighbors)
J02487

bacteriophage r17 coat protein initiation site
gi|216073|gb|J02487|R17COATP [216073]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02486

bacteriophage r17 a protein initiation site
gi|216071|gb|J02486|R17APROT [216071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)

M24826

Bacteriophage R17 coat protein RNA fragment
gi|216077|gb|M24826|R17CPRAA [216077]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 7 nucleotide neighbors)
M24296

Bacteriophage R17 3'-terminal fragment A RNA
gi|216070|gb|M24296|R173TFA [216070]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)
1TFN

structure refinement for a 24-nucleotide rna hairpin, nmr, minimized average structure ribonucleic acid, hairpin, bacteriophage r17 mol_id: 1; molecule: r17c; chain: null; engineered: yes
gi|1942336|pdb|1TFN| [1942336]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)

1RHT

rna (5'-d(gpgpgpapcpupgpapcpgpapupcpapcpgpcpapgpupcpupapu)-3') (24-mer rna hairpin coat protein binding site for bacteriophage r17) (nmr, minimized average structure)

gi|1421020|pdb|1RHT| [1421020]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 structure link)
M14428

Bacteriophage S13 circular DNA, complete genome 
gi|216089|gb|M14428|S13CG [216089]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 12 protein links, 26 nucleotide neighbors, or 1 genome link)
J05393

Bacteriophage T1 DNA N-6-adenine-methyltransferase (M.T1) gene, complete cds
gi|166163|gb|J05393|BT1NAMTA [166163]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
L46845

Bacteriophage T2 frd3, frd2 genes, complete cds
gi|951387|gb|L46845|PT2FRD32G [951387]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 17 nucleotide neighbors)
L43611

Bacteriophage T2 fibritin (wac) gene, complete cds
gi|903869|gb|L43611|PT2WAC [903869]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)

M24812 

Bacteriophage T2 secondary structure RNA sequence 
gi|215796|gb|M24812|PT2RNA [215796]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
M22342

Bacteriophage T2 DNA-(adenine-N6)methyltransferase (dam) gene, complete cds
gi|215792|gb|M22342|PT2DAM [215792]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
S57515

orf 61.2 {intergenic region between 41 and 61} [bacteriophage T2, Genomic, 323 nt]
gi|298524|gb|S57515|S57515 [298524]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)

X05312

Bacteriophage T2 gene 38 for receptor recognizing protein
gi|15197|emb|X05312|MYT2G38 [15197]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)

X04442 

Bacteriophage T2 gene 37 for receptor recognizing protein 
gi|15195|emb|X04442|MYT2G37 [15195]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X12460

Bacteriophage T2 gene 32 mRNA for single-stranded DNA binding protein
gi|15192|emb|X12460|MYT2G32 [15192]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 14 nucleotide neighbors)
X56555

Bacteriophage T2 gene for gp12
gi|14875|emb|X56555|BT2GP12 [14875]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 2 nucleotide neighbors)
X01755

Bacteriophage T2 tail fiber gene 36 

gi|15189|emb|X01755|MYT2F36 [15189]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M14784

gSS 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 10 nucleotide neighbors)
SEG_PT3RNAPOL
Bacteriophage T3 RNA polymerase III gene, 5' end

gi|710559|gb||SEG_PT3RNAPOL [710359]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M22610

Bacteriophage T3 RNA polymerase III gene, 3' end
gi|340722|gb|M22610|PT3RNAPOL2 [340722]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

M22609

Bacteriophage T3 RNA polymerase III gene, 5' end

gi|340721|gb|M22609|PT3RNAPOL1 [340721]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

X05031

Bacteriophage T3 gene region 1-2.5 with primary origin of replication
gi|15719|emb|X05031|POT3ORI [15719]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 11 protein links, or 5 nucleotide neighbors)
X03964

Bacteriophage T3 early control region pos. 308-810 from genome left end
gi|15718|emb|X03964|POT3EP [15718]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 20 nucleotide neighbors)
X17255 

Bacteriophage T3 gene 1 to gene 11
gi|15682|emb|X17255|POT3111G [15682]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, MEDLINE links, 36 protein links, 17 nucleotide neighbors, or 1 genome link)

X15840
Phage T3 gene 10

gi|15625|emb|X15840|PODT3G10 [15625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
X02981

Bacteriophage T3 gene 1 for RNA polymerase

gi|15561|emb|X02981|PODOT3P [15561]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
J02503

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215816|gb|J02503|PT3TRS1 [215816]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
SEG_PT3TRS

bacteriophage t3 5' end, terminally redundant sequence (trs)
gi|215818|gb||SEG_PT3TRS [215818]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
J02504

bacteriophage t3 3' end, terminally redundant sequence (trs)
gi|215817|gb|J02504|PT3TRS2 [215817]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

HYPEfUn^hrip://www.n.i^ n c t p://ww.n.noo^suLac.jp/-kunisawa 

Bacteriophage T4 genomic database compiled by Arisaka et al.

X95646 

Bacteriophage T5 DNA for region 60.5%-11% of the T5 genome
gi|2791557|emb|AJ001191|BTIOOU91 [2791557]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 MEDLINE links, 12 protein links, or 6 nucleotide neighbors)
X56847

Bacteriophage T5 genomic region encoding early genes D10-D15
gi|15407|emb|X12930|MYT5D10 [15407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 4 nucleotide neighbors)
AF039886

Bacteriophage T5 subclone T5.5Jr5. 18r, single pass sequence, genomic survey sequence
gi|2811154|gb|AF039886|AF039886 [2811154]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039885 

Bacteriophage T5 subclone T5.40f,41f, single pass sequence, genomic survey sequence
gi|2811153|gb|AF039885|AF039885 [2811153]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF039884

Bacteriophage T5 subclone T5.26.fr, single pass sequence, genomic survey sequence

gi|2811152|gb|AF039884|AF039884 [2811152]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039883

Bacteriophage T5 subclone 10-T5.5JF, single pass sequence, genomic survey sequence

gi|2811151|gb|AF039883|AF039883 [2811151]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039882

Bacteriophage T5 subclone 41-T5.5.4BF, single pass sequence, genomic survey sequence
gi|2811150|gb|AF039882|AF039882 [2811150]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)






AF039881

Bacteriophage T5 subclone 39-T5.5.4aF, single pass sequence, genomic survey sequence

gi|2811149|gb|AF039881|AF039881 [2811149]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF039880

Bacteriophage T5 subclone 19-T5.7.2r, single pass sequence, genomic survey sequence
gi|2811148|gb|AF039880|AF039880 [2811148]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039879

Bacteriophage T5 subclone 18-T5.7.2F, single pass sequence, genomic survey sequence
gi|2811147|gb|AF039879|AF039879 [2811147]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039878

Bacteriophage T5 subclone 11-T5.5.7R, single pass sequence, genomic survey sequence
gi|2811146|gb|AF039878|AF039878 [2811146]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)

AF039877

Bacteriophage T5 subclone T5.4FR, single pass sequence, genomic survey sequence

gi|2811145|gb|AF039877|AF039877 [2811145]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039876

Bacteriophage T5 subclone 22-T5.16R, single pass sequence, genomic survey sequence
gi|2811144|gb|AF039876|AF039876 [2811144]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039875 

Bacteriophage T5 subclone 21-T5.16R, single pass sequence, genomic survey sequence 

gi|2811143|gb|AF039875|AF039875 [2811143]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039874

Bacteriophage T5 subclone 21-T5.16T, single pass sequence, genomic survey sequence

gi|2811142|gb|AF039874|AF039874 [2811142]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039873

Bacteriophage T5 subclone 09-T5.6T, single pass sequence, genomic survey sequence
gi|2811141|gb|AF039873|AF039873 [2811141]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039872

Bacteriophage T5 subclone 09-T5.6R, single pass sequence, genomic survey sequence
gi|2811140|gb|AF039872|AF039872 [2811140]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 nucleotide neighbors)

AF039871

Bacteriophage T5 subclone 04-T5.26.R, single pass sequence, genomic survey sequence

gi|2811139|gb|AF039871|AF039871 [2811139]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF039870

Bacteriophage T5 subclone 13-T5.42F, single pass sequence, genomic survey sequence
gi|2811138|gb|AF039870|AF039870 [2811138]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
X69460

Bacteriophage T5 ltf gene for L-shaped tail fibers
gi|15415|emb|X69460|MYT5LTF [15415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 1 protein link, or 4 nucleotide neighbors)
X03402

Bacteriophage T5 D15 gene for 5' exonuclease
gi|15413|emb|X03402|MYT5EXOG [15413]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
Z11972

Bacteriophage T5 tRNA-Tyr, tRNA-Glu, tRNA-Trp, tRNA-Phe, tRNA-Cys and tRNA-Asn genes, and ORFs 91aa, 90aa, 42aa and 172aa

gi|15795|emb|Z11972|T56TRNAG [15795]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 3 nucleotide neighbors)
X03898

Bacteriophage T5 genes for tRNA-His, -Ser and -Leu
gi|15786|emb|X03898|STT5RN1 [15786]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 MEDLINE links)
X04177

Bacteriophage T5 gene for transfer RNA-Gln
gi|15421|emb|X04177|MYT5TRNQ [15421]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)

X03899

Bacteriophage T5 genes for tRNA-Val, -Lys, -fMet, -Pro and -Ile3
gi|15787|emb|X03899|STT5RN2 [15787]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X03798

Bacteriophage T5 gene for tRNA-Asp (GUC)
gi|15472|emb|X03798|NCT5TRDQ [15472]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

Y00364

Bacteriophage T5 tRNA gene cluster (27.8%-22.4%)
gi|15420|emb|V00364|MYT5TRN [15420]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X03140

Bacteriophage T5 DNA with rho-dependent transcription terminator (Hind III-P fragment)
gi|15417|emb|X03140|MYT5RHO [15417]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
Z35070

Bacteriophage T6 DNA

gi|535228|emb|Z35074|MYEREGBT6 [535228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)






WO 00/32825 



PCT/IB99/02040 




AF060870

gi|3676458|gb|AF052605|AF052605 [3676458]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 2 nucleotide neighbors)

Z35072

Bacteriophage T6 DNA encoding ORF19.1 gene and g19 gene
gi|535232|emb|Z35072|MYTAILT6 [535232]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X12488

Bacteriophage T6 gene 32 mRNA for single-stranded DNA binding protein
gi|15843|emb|X12488|MYT6G32 [15843]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 14 nucleotide neighbors)

Z78095

Bacteriophage T6 DNA (1506 bp)
gi|1488562|emb|Z78095|BPHZ78095 [1488562]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
Z35079

Bacteriophage T6 DNA for Ip5, Ip6
gi|535215|emb|Z35079|MY57BT6 [535215]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X68725

E.coli bacteriophage T6 gene for beta-glucosyl-HMC-alpha-glucosyl-transferase
gi|296439|emb|X68725|ECT6 [296439]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
X69894

Bacteriophage T6 alt gene for ADP-Ribosyltransferase
gi|15422|emb|X69894|MYT6ADP [15422]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)

L46846

Bacteriophage T6 frd3, frd2 genes, complete cds
gi|951390|gb|L46846|PT6FRD32G [951390]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)

M27738

Bacteriophage T6 translational repressor protein (regA), complete cds
gi|215993|gb|M27738|PT6REGA [215993]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 5 nucleotide neighbors)
M38465

Bacteriophage T6 DNA ligase gene, complete cds

gi|215991|gb|M38465|PT6LIG55 [215991]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
V01146
Genome of bacteriophage T7 
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)

X60322

Bacteriophage alpha3 genes A, B, K, C, D, E, J, F, G, H
gi|14775|emb|X60322|BACALPHA [14775]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 22 nucleotide neighbors, or 1 genome link)

X13332

Bacteriophage alpha3 DNA for origin of replication
gi|15093|emb|X13332|MIA3ORPL [15093]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X12611

Bacteriophage alpha3 gene for protein A pan., finger domain

gi|15092|emb|X12611|MIA3AFIN [15092]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 6 nucleotide neighbors)
X15721

Bacteriophage alpha3 deletion mutation DNA for the origin region (-ori) of replication
gi|14774|emb|X15721|BA3DMOR9 [14774]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)

X15720 

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication
gi|14773|emb|X15720|BA3DMOR8 [14773]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
X15719

Bacteriophage alpha3 insertion mutant DNA for the origin region (-ori) of replication

gi|14772|emb|X15719|BA3DMOR7 [14772]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15718

Bacteriophage alpha3 deletion mutation DNA for origin region (-ori) of replication
gi|14771|emb|X15718|BA3DMOR6 [14771]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)

X15717

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14770|emb|X15717|BA3DMOR5 [14770]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 9 nucleotide neighbors)

X15716

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14769|emb|X15716|BA3DMOR4 [14769]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 10 nucleotide neighbors)
X15715

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14768|emb|X15715|BA3DMOR3 [14768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15714

Bacteriophage alpha3 deletion mutant DNA for origin region (-ori) of replication
gi|14767|emb|X15714|BA3DMOR2 [14767]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X15713

Bacteriophage alpha3 deletion mutant DNA for the origin region (-ori) of replication
gi|14766|emb|X15713|BA3DMOR1 [14766]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 11 nucleotide neighbors)
X62059

Bacteriophage alpha3 origin of cDNA synthesis (oriGA)
gi|14763|emb|X62059|AL3ORIGA [14763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
X62058

Bacteriophage alpha3 origin of cDNA synthesis (oriAA)
gi|14762|emb|X62058|AL3ORIAA [14762]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 13 nucleotide neighbors)
J02444

Bacteriophage alpha3 origin of DNA replication
gi|166103|gb|J02444|AL3ORI [166103]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 12 nucleotide neighbors)
M25640 

Bacteriophage alpha-3 H protein gene, complete cds 
gi|166101|gb|M25640|AL3HP [166101]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors)
M10631

Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein
gi|166099|gb|M10631|AL3CSA [166099]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
X00774

Bacteriophage alpha-3 gene J sequence
gi|15431|emb|X00774|NCBA3J [15431]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 2 nucleotide neighbors)
M25640

Bacteriophage alpha-3 H protein gene, complete cds
gi|166101|gb|M25640|AL3HP [166101]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 13 nucleotide neighbors)
M10631

Bacteriophage alpha-3 cleavage site for phage phi-X174 gene A protein
gi|166099|gb|M10631|AL3CSA [166099]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
J02459

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, MEDLINE links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)

J02482

Bacteriophage phi-X174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)

J02454

Bacteriophage G4, complete genome
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 11 protein links, 20 nucleotide neighbors, or 1 genome link)
X60323

Bacteriophage phiK complete genome

gi|1478118|emb|X60323|BPHIKCG [1478118]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, 18 nucleotide neighbors, or 1 genome link)
L42820

Bacteriophage BF23 tail protein (hrs) gene, complete cds
gi|1048680|gb|L42820|BBFHRS [1048680]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)

X54455 

Bacteriophage BF23 gene 17 and gene 18 
gi|14797|emb|X54455|BF23P18G [14797]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
M37097

Bacteriophage BF23 DNA, right end of terminal repetition
gi|166115|gb|M37097|BBFRIGH [166115]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)

M37096

Bacteriophage BF23 DNA, left end of terminal repetition
gi|166114|gb|M37096|BBFLEFT [166114]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)

M37095

Bacteriophage BF23 A2-A3 gene, complete cds, and A1 gene, 5' end
gi|166110|gb|M37095|BBFA2A3 [166110]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 3 protein links, or 1 nucleotide neighbor)
AF056281

Bacteriophage BF23 clone bf23.mac5/6.1, genomic survey sequence

gi|3090930|gb|AF056281|AF056281 [3090930]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056280

Bacteriophage BF23 clone bf23.mac3, genomic survey sequence
gi|3090929|gb|AF056280|AF056280 [3090929]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056279

Bacteriophage BF23 clone bf23.mac16/21.34, genomic survey sequence
gi|3090928|gb|AF056279|AF056279 [3090928]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056278

Bacteriophage BF23 clone bf23.mac16/19.33, genomic survey sequence
gi|3090927|gb|AF056278|AF056278 [3090927]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056277

Bacteriophage BF23 clone bf23.mac16/19.33f, genomic survey sequence
gi|3090926|gb|AF056277|AF056277 [3090926]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056276

Bacteriophage BF23 clone bf23.mac12/9-9, genomic survey sequence
gi|3090925|gb|AF056276|AF056276 [3090925]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056275

Bacteriophage BF23 clone bf23.mac11/14-24, genomic survey sequence
gi|3090924|gb|AF056275|AF056275 [3090924]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056274

Bacteriophage BF23 clone bf23.37r64r, genomic survey sequence
gi|3090923|gb|AF056274|AF056274 [3090923]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 3 nucleotide neighbors)
AF056273 

Bacteriophage BF23 clone bf23.54fc, genomic survey sequence

gi|3090922|gb|AF056273|AF056273 [3090922]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056272

Bacteriophage BF23 clone bf23.47f.mac10/7, genomic survey sequence

gi|3090921|gb|AF056272|AF056272 [3090921]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056271

Bacteriophage BF23 clone bf23.23.66>, genomic survey sequence
gi|3090920|gb|AF056271|AF056271 [3090920]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056270

Bacteriophage BF23 clone bf23.23.64f, genomic survey sequence
gi|3090919|gb|AF056270|AF056270 [3090919]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056269

Bacteriophage BF23 clone bf23.23.60r, genomic survey sequence

gi|3090918|gb|AF056269|AF056269 [3090918]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056268

Bacteriophage BF23 clone bf23.23.60f, genomic survey sequence
gi|3090917|gb|AF056268|AF056268 [3090917]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
AF056267

Bacteriophage BF23 clone bf23.23.59r, genomic survey sequence

gi|3090916|gb|AF056267|AF056267 [3090916]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056266

Bacteriophage BF23 clone bf23.23.59f, genomic survey sequence

gi|3090915|gb|AF056266|AF056266 [3090915]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056265

Bacteriophage BF23 clone bf23.23.56r, genomic survey sequence

gi|3090914|gb|AF056265|AF056265 [3090914]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056264

Bacteriophage BF23 clone bf23.23.56f, genomic survey sequence

gi|3090913|gb|AF056264|AF056264 [3090913]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056263

Bacteriophage BF23 clone bf23.23.68f55r, genomic survey sequence

gi|3090912|gb|AF056263|AF056263 [3090912]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056262 

Bacteriophage BF23 clone bf23.23.43fr.66f, genomic survey sequence 

gi|3090911|gb|AF056262|AF056262 [3090911]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056261

Bacteriophage BF23 clone bf23.23.2fr, genomic survey sequence

gi|3090910|gb|AF056261|AF056261 [3090910]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056260

Bacteriophage BF23 clone bf23.23.55.f, genomic survey sequence

gi|3090909|gb|AF056260|AF056260 [3090909]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056259

Bacteriophage BF23 clone bf23.23.53.r, genomic survey sequence

gi|3090908|gb|AF056259|AF056259 [3090908]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056258

Bacteriophage BF23 clone bf23.23.53.f, genomic survey sequence 

gi|30909O7|gblAFO56258|AFO56258 (3090907J 

(View GenBank rcport,FASTA repon,ASN.l report, or Graphical view) 

AF056257 

Bacteriophage BF23 clone bf23.23.52x, genomic survey sequence 

gii3090906|gb|AF056257|AF056257 [3090906} 

(View GenBank report,FASTA reporvASNM report, or Graphical view)* 

AF056256 

Bacteriophage BF23 clone bf23.23.52.f, genomic survey sequence 

gi|3090905|gb(AF056256|AF056256 {3090905) 

(View GenBank report,FASTA report.ASN.1 repon, or Graphical view) 

AF056255 

Bacteriophage BF23 clone bi73.23.49.r, genomic survey sequence 

gq309O9O4|gb|AF056255|AF056255 [3090904) 

(View GenBank report,FASTA rcporlASN. 1 report, or Graphical view) 

AF056254 

Bacteriophage BF23 clone bf23.23.49.f, genomic survey sequence 

gi|309O903|gb|AF056254|AF056254 (3090903) 

(View GenBank repcrt,FASTA reportASN. 1 report, or Graphical view) 

AF056253 

Bacteriophage BF23 clone bf23.23.48.r, genomic survey sequence 

gi|3090902|gb|AF056253|AF056253 [3090902]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056252

Bacteriophage BF23 clone bf23.23.48.f, genomic survey sequence

gi|3090901|gb|AF056252|AF056252 [3090901]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056251

Bacteriophage BF23 clone bf23.23.44.r, genomic survey sequence

gi|3090900|gb|AF056251|AF056251 [3090900]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056250 

Bacteriophage BF23 clone bf23.23.41.f, genomic survey sequence

gi|3090899|gb|AF056250|AF056250 [3090899]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056249 

Bacteriophage BF23 clone bf23.23.22.a.r, genomic survey sequence

gi|3090898|gb|AF056249|AF056249 [3090898]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

AF056248 

Bacteriophage BF23 clone bf23.23.22.a.f, genomic survey sequence 

gi|3090897|gb|AF056248|AF056248 [3090897]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
AF056247
Bacteriophage BF23 clone bf23.23.68.r, genomic survey sequence
gi|3090896|gb|AF056247|AF056247 [3090896]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

Z50114

Bacteriophage BF23 DNA for putative tail protein gene
gi|2464952|emb|Z50114|BF23LATE [2464952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)
D12824 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
Z34953

Bacteriophage K3 ip9, ip7 and ip8 genes
gi|535261|emb|Z34953|MYK3IP978 [535261]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)

Z35075

Bacteriophage K3 DNA for Ip3 and Ip4
gi|533229|emb|Z35075|MYEORF64K [533229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X05560 

Bacteriophage K3 gene 38 for receptor recognizing protein

gi|15112|emb|X05560|MYK3G38 [15112]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
X04747 

Bacteriophage K3 gene 37 for receptor recognizing protein 
gi|15110|emb|X04747|MYK3G37 [15110]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)

X01754 

Bacteriophage K3 tail fiber gene 36 
gi|15108|emb|X01754|MYK3F36 [15108]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
M16812 

Bacteriophage K3 t lysis gene, complete cds
gi|215503|gb|M16812|PK3LYST [215503]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
L46833 

Bacteriophage K3 frd3, frd2 genes, complete cds 
gi|951377|gb|L46833|PK3FRD32G [951377]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
L43613 

Bacteriophage K3 fibritin (wac) gene, complete cds 
gi|903861|gb|L43613|PK3WAC [903861]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
X01753

Bacteriophage Ox2 tail fiber gene 36
gi|15122|emb|X01753|MYOX2F36 [15122]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)

L43612

Bacteriophage Ox2 fibritin (wac) gene, complete cds
gi|903848|gb|L43612|OX2WAC [903848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 4 nucleotide neighbors)
Z46880

Bacteriophage Ox2 stp gene

gi|599663|emb|Z46880|BPOX2STP [599663]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X05675 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)

M33533

... translational repressor pr

gi|216083|gb|M33533|RB18REGA [216083]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)

AF033329 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
M86231 

Bacteriophage RB69 gene 62, 3' end; RegA (regA) gene, complete cds
gi|215354|gb|M86231|Po962REGA [215354]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
AF033332 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 12 nucleotide neighbors)
U34036

Bacteriophage RB69 DNA polymerase (43) gene, complete cds
gi|1237125|gb|U34036|BRU34036 [1237125]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)



V01145

gi|15557|emb|V01145|PODOHl [15557]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)



X05676 



(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
AF034575

Bacteriophage M1 putative integrase (int) gene, complete cds, and attP region
gi|2662472|gb|AF034575|AF034575 [2662472]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF033321

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)
X53190 

g^Ts^x^ 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
AF033334 

Bacteriophage TuIb single-stranded binding protein (gene 32) gene, partial cds, and y
gi|2645798|gb|AF033334|AF033334 [2645798]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 5 nucleotide neighbors)

X55191 

8CaC '^'°" c «* " (P^ cd,), 38 gene for receptowecognizing protein 38, 

gi|14863|emb|X55191|BPTUIB [14863]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)
X13065 

Bacteriophage phi80 early region 
gi|14800|emb|X13065|BP80ER [14800]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 6 nucleotide neighbors)
D00360 

Bacteriophage phi80 cor gene
gi|217782|dbj|D00360|P8080COR [217782]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)



X01639 



Bacteriophage phi 80 DNA-fragment with replication origin 
gi|15828|emb|X01639|XXPHI80 [15828]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 25 nucleotide neighbors)

X04051 

Lambdoid bacteriophage phi 80 int-xis region (integrase-excisionase region)
gi|15770|emb|X04051|S7PHI80X [15770]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
X06751

Phage Phi80 DNA for major coat protein
gi|15768|emb|X06751|S7PHI80C [15768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 11 nucleotide neighbors)



X75949 



Bacteriophage phi80 DNA for ORF xl7U and ORF xl7U8* 
gi|458811|emb|X75949|ECORF171B [458811]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 28 nucleotide neighbors)
L40418

Bacteriophage phi-80 gene, complete cds
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M24831 

Bacteriophage phi-80 Tyr-tRNA gene, 3' end 
gi|215363|gb|M24831|P80TGY [215363]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 43 nucleotide neighbors)
M10670

Bacteriophage phi-80 replication origin

gi|215361|gb|M10670|P80ORI [215361]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M24825 

Bacteriophage phi-80 RNA fragment 
gi|215360|gb|M24825|P80M3A [215360]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)

M11919

Bacteriophage phi-80 cI immunity region encoding the N gene
gi|215358|gb|M11919|P80CI [215358]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
M10891 

Bacteriophage phi-80 attP site DNA
gi|215357|gb|M10891|P80ATTP [215357]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 nucleotide neighbor)
M19473

g% c ;i^ 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 2 protein links, or 20 nucleotide neighbors)
Y10775 

Bacteriophage 933W ileX, stx2A and stx2B genes
gi|1938206|emb|Y10775|BP933ILEX [1938206]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 36 nucleotide neighbors)
X83722 

Bacteriophage 933W slt-IIB gene
gi|1490229|emb|X83722|B933WSLT [1490229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 20 nucleotide neighbors)
X07865 

Bacteriophage 933W slt-II gene for Shiga-like toxin type II subunit A and B

gi|14892|emb|X07865|BWSLTII [14892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 29 nucleotide neighbors)

M16625 

Bacteriophage H19B (from E.coli) sltIA and sltIB genes encoding Shiga-like toxin I subunits A and B, complete cds
gi|215043|gb|M16625|H19BSLT [215043]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 24 nucleotide neighbors)






M17358
Bacteriophage H19B shiga-like toxin-1 (SLT-I) A and B subunit DNA, complete cds

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 20 nucleotide neighbors)
U29728

Bacteriophage N4 single-stranded DNA-binding protein (N4SSB) gene, complete cds
gi|939708|gb|U29728|BNU29728 [939708]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 1 protein link)
J02580 

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2, outer membrane porin protein (lc) and ORF1 genes, complete cds

gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 4 nucleotide neighbors)
U32222 

Bacteriophage 186, complete sequence 
gi|3337249|gb|U32222|B1U32222 [3337249]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 46 protein links, or 5 nucleotide neighbors)

X51522 

Bacteriophage P4 complete DNA genome 
gi|450916|emb|X51522|MYP4CG [450916]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 13 protein links, 6 nucleotide neighbors, or 1 genome link)

X92588

Bacteriophage 82 orf33, orf151, orf56, orf96, rus, orf45, and Q genes
gi|1051111|emb|X92588|BAC82HOLL [1051111]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 7 protein links, or 1 nucleotide neighbor)
J02803 

Bacteriophage 82 antitermination protein (Q) gene, complete cds

gi|215364|gb|J02803|P82Q [215364]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U02466 

Bacteriophage HK022 (cro), (cII) and (O) genes, complete cds, (P) gene, partial cds
gi|407285|gb|U02466|BHU02466 [407285]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 5 protein links, or 1 nucleotide neighbor)

M26291 

Bacteriophage D108 regulatory DNA-binding protein (ner) gene, complete cds
gi|166194|gb|M26291|D18NER [166194]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)

M11272

Bacteriophage D108 left-end DNA
gi|166193|gb|M11272|D18LEDNA [166193]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 nucleotide neighbors)
M18902 

Bacteriophage D108 kil gene encoding a replication protein, 3' end; and containing three ORFs, complete cds 
gi|166191|gb|M18902|D18KIL [166191]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 3 nucleotide neighbors)
M10191

Bacteriophage D108, left end with Mu A protein binding sites L1 and L2
gi|166190|gb|M10191|D18BSL [166190]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
J02447 

bacteriophage d108 gene a 5' end

gi|166189|gb|J02447|D18AAA [166189]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
V00865

Bacteriophage D108 fragment from genes A and ner (C-terminus of ner and N-terminus of A)

gi|15437|emb|V00865|NCD108 [15437]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)
X01914 

Bacteriophage IKe gene for DNA binding protein 
gi|14957|emb|X01914|INIKEDBP [14957]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
AF064539 
Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 60 protein links, 26 nucleotide neighbors,

U02303

Bacteriophage If1, complete genome
gi|3676280|gb|U02303|B2U02303 [3676280]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 10 protein links, or 1 genome link)
AF007792 

Bacteriophage Mu late morphogenetic region

gi|3351775|gb|AF007792|AF007792 [3351775]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
U24159

Bacteriophage HP1 strain HP1c1, complete genome
gi|1046235|gb|U24159|BHU24159 [1046235]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE links, 41 protein links, 8 nucleotide neighbors, or 1 genome link)



Z71579 

Bacteriophage S2 type A 5.6 kb DNA fragment 
gi|1679806|emb|Z71579|BPHSlADNA [1679806] 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 9 protein links, or 9 nucleotide neighbors)
X53238

Klebsiella sp. bacteriophage K11 gene 1 for RNA polymerase
gi|14984|emb|X53238|KSK11RPO [14984]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X85010

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

U29728

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 1 protein link)
J02445 

bacteriophage bo1 3'-terminal region rna

gi|166152|gb|J02445|BO1TR3 [166152]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 5 nucleotide neighbors)
L06183 

Bacteriophage L5 (from Leuconostoc oenos) genome
gi|289353|gb|L06183|BUGENM [289353]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 genome link)

AF074945 

Mycoplasma arthritidis bacteriophage MAV1, complete genome
gi|3511243|gb|AF074945|AF074945 [3511243]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 15 protein links, 3 nucleotide neighbors, or 1 genome link)
L13696

Bacteriophage L2 (from Mycoplasma), complete genome
gi|289338|gb|L13696|BL2CG [289338]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 14 protein links, or 1 genome link)
X80191 
Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase proteins
gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 genome link)



M19377 



Bacteriophage Pf3 from Pseudomonas aeruginosa (New York strain), complete genome
gi|215380|gb|M19377|PF3COMNY [215380]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 5 nucleotide neighbors)



M11912

Bacteriophage Pf3 from Pseudomonas aeruginosa (Nijmegen strain), complete genome
gi|215371|gb|M11912|PF3COMN [215371]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, 5 nucleotide neighbors, or 1 genome link)

V00605

Bacteriophage Pf1 gene encoding DNA binding protein
gi|14970|emb|V00605|INOPF1 [14970]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
L05626 

Bacteriophage PR4 capsid protein (P6) gene, complete cds 
gi|215735|gb|L05626|PR4P6MAJA [215735]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)

D13409 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 3 nucleotide neighbors)

D13408

Bacteriophage phiCTX (isolated from Pseudomonas aeruginosa) cosL cot
gi|217775|dbj|D13408|BPHCOSLCTX [217775]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
M24832 

Bacteriophage f2 coat protein gene, partial cds

gi|166228|gb|M24832|F2CRNACA [166228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
S72011 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)

AF017628

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)

AF017627

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618961|gb|AF017627|AF017627 [2618961]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626 

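Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)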
AF017625 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618955|gb|AF017625|AF017625 [2618955]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017624 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618952|gb|AF017624|AF017624 [2618952]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017623

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds

gi|2618949|gb|AF017623|AF017623 [2618949]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017622

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618946|gb|AF017622|AF017622 [2618946]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017621

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618943|gb|AF017621|AF017621 [2618943]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
D26449

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
X87627 

Bacteriophage D3112 A and B genes
gi|974768|emb|X87627|BPD3112AB [974768]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
U32623 

Bacteriophage D3 transcriptional activator CII (cII) gene, complete cds
gi|984852|gb|U32623|BDU32623 [984852]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 1 nucleotide neighbor)
L34781

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
L14810

Bacteriophage P22 (gp10) gene, complete cds and (gp26) gene, complete cds

gi|294053|gb|L14810|P22GP1026X [294053]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
X87420 

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]
30 (V|WG ^^^°* FASTA ^«^ 
L42820 

Bacteriophage BF23 tail protein (hrs) gene, complete cds
gi|1048680|gb|L42820|BBFHRS [1048680]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X14980

Bacteriophage PRD1 XV gene for protein P15 (lytic enzyme)

gi|15802|emb|X14980|TEPRD1XV [15802]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 4 nucleotide neighbors)
X06321 

Bacteriophage PRD1 gene 8 for DNA terminal protein
gi|15800|emb|X06321|TEPRD18 [15800]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 10 nucleotide neighbors)
X14336 

Filamentous Bacteriophage I2-2 genome
gi|14920|emb|X14336|INBI22 [14920]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, 1 nucleotide neighbor, or 1 genome link)
L05001
Bacteriophage X glucosyl transferase gene, complete cds
gi|216044|gb|L05001|PXFCLUSVLT [216044]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
M29479 

Bacteriophage P4 sid and psu genes, partial cds, and delta gene, complete cds
gi|215701|gb|M29479|PP4SDP [215701]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 protein links, or 4 nucleotide neighbors)

SEG_PP4PSUSID 

Bacteriophage P4 capsid size determination protein (sid) gene, 5' end
gi|215698|gb||SEG_PP4PSUSID [215698]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 1 nucleotide neighbor)
M29650 

Bacteriophage P4 polarity suppression protein (psu) gene, complete cds 
gi|215697|gb|M29650|PP4PSUSID2 [215697]
(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

M29651

Bacteriophage P4 capsid size determination protein (sid) gene, 5' end

gi|215696|gb|M29651|PP4PSUSID1 [215696]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

M27748 

Bacteriophage P4 gop, beta, and cII genes, complete cds and int gene, 3' end

gi|215691|gb|M27748|PP4GOPBC [215691]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 1 nucleotide neighbor)
K02750 

Bacteriophage IKe, complete genome 
gi|215061|gb|K02750|IKECG [215061]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 10 protein links, 4 nucleotide neighbors, or 1 genome link)

L40418 

Bacteriophage phi-80 gene, complete cds 
gi|1019107|gb|L40418|P80A [1019107]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
AF032122

Bacteriophage SfII integrase (int) gene, partial cds; and bactoprenol glucosyl transferase (bgt), and glucosyl transferase II (gtrII) genes, complete cds

gi|2465412|gb|AF021347|AF021347 [2465412]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 4 protein links, or 2 nucleotide neighbors)
M35825 

Bacteriophage SF6 fragment D lysozyme gene, complete cds
gi|216105|gb|M35825|SF6LYZ [216105]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 protein link)

Z35479
Bacteriophage C16 ip1 gene
gi|534936|emb|Z35479|BC16IP1 [534936]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 2 nucleotide neighbors)
X12638

Bacteriophage 21 DNA for gene 2
gi|296141|emb|X12638|B21GENE2 [296141]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
X02501

Bacteriophage 21 DNA for left end sequence with genes 1 and 2
gi|15825|emb|X02501|XXPHA21 [15825]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M65239 

Bacteriophage 21 lysis genes S, R, and Rz, complete cds
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 3 protein links, or 1 nucleotide neighbor)
M58702 

Bacteriophage 21 late gene regulatory region
gi|215465|gb|M58702|PH2LATEGE [215465]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M81255 

Bacteriophage 21 head gene operon
gi|215454|gb|M81255|PH2HEADTL [215454]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 10 protein links, or 4 nucleotide neighbors)
M23775 

Bacteriophage 21 glycoprotein 1 gene, complete cds, and glycoprotein gene, 5' end
gi|215451|gb|M23775|PH2GPA [215451]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
M61865 

Bacteriophage 21 excisionase (xis), integrase (int) and isocitrate dehydrogenase (icd), complete cds
gi|215448|gb|M61865|PH22XISAA [215448]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 9 nucleotide neighbors)
S72011

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618967|gb|AF017629|AF017629 [2618967]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017628 

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618964|gb|AF017628|AF017628 [2618964]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017627

Bacteriophage 21 isocitrate dehydrogenase (icd) and integrase (int) genes, partial cds
gi|2618961|gb|AF017627|AF017627 [2618961]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017626

Bacteriophage 21 isocitrate dehydrogenase (icd) gene, partial cds; and integrase (int) gene, partial cds
gi|2618958|gb|AF017626|AF017626 [2618958]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 49 nucleotide neighbors)
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)

AF017624

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)

AF017623 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)

AF017622 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
AF017621

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 44 nucleotide neighbors)
M57455 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 9 nucleotide neighbors)



Y12633

Bacteriophage 85 DNA, promoter sequence of unknown gene

gi|2058285|emb|Y12633|B85PROM [2058285]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)



X98146

Bacteriophage P1 DNA sequence around the Op88 operator
gi|1359513|emb|X98146|BP1OP88OP [1359513]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 nucleotide neighbor)
Y07739 

Staphylococcus phage Twort holTW, plyTW genes

gi|2764979|emb|Y07739|BPTWGHOLG [2764979]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2 protein links)
L07580 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 2 protein links)

M34832 

Bacteriophage phi-11 integrase (int) and excisionase (xis) genes, complete cds

gi|166157|gb|M34832|BPHINTXIS [166157]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 2 nucleotide neighbors)
M20394

Bacteriophage phi-11 S. aureus attachment site (attP)
gi|166156|gb|M20394|BPHATTP [166156]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 nucleotide neighbors)
X82312

Bacteriophage phi-13 integrase gene
gi|758228|emb|X82312|PHI13INT [758228]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 3 nucleotide neighbors)
X61719

S. aureus phi-13 lysogen right chromosomal/bacteriophage DNA junction

gi|46625|emb|X61719|SAP13RJNC [46625]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
X61718 

S. aureus phi-13 lysogen left chromosomal/bacteriophage DNA junction
gi|46624|emb|X61718|SAP13LJNC [46624]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)

X61717 

Bacteriophage phi-13 core sequence for attachment
gi|14799|emb|X61717|BP13ATTP [14799]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, or 3 nucleotide neighbors)
U01875

**3 tt 7^ 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, or 4 nucleotide neighbors)
X67739 

S.aureus Bacteriophage phi-42 attP gene
gi|14809|emb|X67739|BPATTPA [14809]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 3 nucleotide neighbors)
U01872 

Bacteriophage phi-42 integrase (int) gene, complete cds
gi|437115|gb|U01872|U01872 [437115]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 2 protein links, or 3 nucleotide neighbors)

X94423 

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 1 nucleotide neighbor)
M27965 

Bacteriophage L54a (from S. aureus) int and xis genes, complete cds

gi|215096|gb|M27965|L54INTXIS [215096]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 2 protein links, or 3 nucleotide neighbors)
U72397 

Bacteriophage 80 alpha holin and amidase genes, complete cds 
gi|1763241|gb|U72397|BBU72397 [1763241]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 protein links, or 2 nucleotide neighbors)
AB009866

Bacteriophage phi PVL proviral DNA, complete sequence
gi|3341907|dbj|AB009866|AB009866 [3341907]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 63 protein links, or 1 nucleotide neighbor)
Z47794

Bacteriophage Cp-1 DNA, complete genome
gi|2288892|emb|Z47794|BPCP1XX [2288892]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE links, 28 protein links, 1 nucleotide neighbor, or 1 genome link)

SEG_CP7RSIT 
Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat 
gi|166186|gb||SEG_CP7RSIT [166186]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
Ml 1635 

Bacteriophage Cp-7 (S.pneumoniae) DNA. 3' inverted terminal repeat 
gt]166t85!gb|Ml 1635JCP7RSIT2 [166185] 

(View GenBank report,FASTA reportASN.l report, or Graphical view) 
Ml 1636 

Bacteriophage Cp-7 (S.pneumoniae) 5' inverted terminal repeat 
gi|166184|gb|M11636|CP7RSIT1 [166184]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)

SEG_CP5RSIT
Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166181|gb||SEG_CP5RSIT [166181]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1 MEDLINE link)
M11633

Bacteriophage Cp-5 (S.pneumoniae) 3' inverted terminal repeat
gi|166180|gb|M11633|CP5RSIT2 [166180]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M11634

Bacteriophage Cp-5 (S.pneumoniae), 5' inverted terminal repeat
gi|166179|gb|M11634|CP5RSIT1 [166179]

(View GenBank report, FASTA report, ASN.1 report, or Graphical view)
M34780 

Bacteriophage Cp-9 muramidase (cpl9) gene
gi|166187|gb|M34780|CP9CPL [166187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 1 protein link, or 1 nucleotide neighbor)
M34652

Bacteriophage HB-3 amidase (hbl) gene, complete cds 
gi|215055|gb|M34652|HB3HBLA [215055]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)
U64984 

Streptococcus pyogenes phage T12 repressor, excisionase (xis), integrase (int) and erythrogenic toxin A precursor (speA) genes,
complete cds
gi|1877426|gb|U40453|SPU40453 [1877426]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE links, 4 protein links, or 22 nucleotide neighbors)
X12375

Phage CP-T1 (Vibrio cholerae) DNA for packaging signal (pac site)
gi|15435|emb|X12375|NCCPPAC [15435]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)

AF087814

Vibrio cholerae filamentous bacteriophage fs-2 DNA, complete genome sequence
gi|3702207|dbj|AB002632|AB002632 [3702207]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 9 protein links, or 1 genome link)
D83518 

Bacteriophage KVP40 gene for major capsid protein precursor, complete cds
gi|3046858|dbj|D83518|D83518 [3046858]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 1 protein link)

AF033322 

Bacteriophage PST single-stranded binding protein (gene 32) gene, partial cds, and 5' region
gi|2645774|gb|AF033322|AF033322 [2645774]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 protein link, or 17 nucleotide neighbors)

X94331 

Bacteriophage L cro, 24, c2, and c1 genes
gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, or 4 protein links)



U82619 

Shigella flexneri bacteriophage V glucosyl transferase (gtrX), integrase (int) and excisionase (xis) genes, complete cds
gi|2465470|gb|U82619|SFU82619 [2465470]
(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE link, 8 protein links, or 1 nucleotide neighbor)
Table 12



NCBI Entrez Nucleotide QUERY
Key words: bacteriophage and lysis
56 citations found (all selected)



AJ011581

o^kteoif CPSU9I),iS gC0CS 13, 19, l5,Ud P BClM « in «« e, «3. 
gi|3676084|emb|AJ011581|BPS011581 [3676084]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)

AJ011580

Bacteriophage PS34 lysis genes 13, 19, 15, antiterminator gene 23, and
packaging gene 3, complete cds
gi|3676078|emb|AJ011580|BPS011580 [3676078]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein
links, or 2 nucleotide neighbors)



AJ011579 



(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 protein
links, or 1 nucleotide neighbor)



AF034975

Bacteriophage H-19B essential recombination function protein (erf), kil
protein (kil), regulatory protein cIII (cIII), protein gp17 (17), N
protein (N), cI protein (cI), cro protein (cro), cII protein (cII), O
protein (O), P protein (P), ren protein (ren), Roi (roi), Q protein (Q),
Shiga-like toxin A (slt-IA) and B (slt-IB) subunits, and putative holin
protein (S) genes, complete cds
gi|2668751|gb|AF034975|AF034975 [2668751]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 20 protein links, or 30 nucleotide neighbors)



(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 1 protein link, or 9 nucleotide neighbors)



E. coli hflA locus encoding the hflX, hflK and hflC genes, hfq gene,
complete cds; miaA gene, partial cds
gi|436153|gb|U00005|ECOHFLA [436153]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 5 protein links, or 8 nucleotide neighbors)
U32222 

Bacteriophage 186, complete sequence
gi|3337249|gb|U32222|B1U32222 [3337249]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 46 protein links, or 5 nucleotide neighbors)

AF064539 

Bacteriophage N15, complete genome
gi|3192683|gb|AF064539|AF064539 [3192683]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 60 protein links, 26 nucleotide neighbors, or 1 genome link)

AF063097 



Bacteriophage P2, complete genome 
gi|3139086|gb|AF063097|AF063097 [3139086]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 21 MEDLINE
links, 42 protein links, 3 nucleotide neighbors, or 1 genome link)

Z97974 

Bacteriophage phiadh lys, hol, intG, rad, and tec genes
gi|2707950|emb|Z97974|BPHIADH [2707950]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 9 protein links, or 1 nucleotide neighbor)

AF059243 

Bacteriophage NL95, complete genome 
gi|3088545|gb|AF059243|AF059243 [3088545]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 4 protein links, 3 nucleotide neighbors, or 1 genome link)

AF052431

Bacteriophage M11 A-protein, coat protein, A1-protein, and replicase
genes, complete cds
gi|2981208|gb|AF052431|AF052431 [2981208]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 4 protein links, or 8 nucleotide neighbors)

Y07739 

Staphylococcus phage Twort holTW, plyTW genes
gi|2764979|emb|Y07739|BPTWGHOLG [2764979]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 2
protein links)

X94331

Bacteriophage L cro, 24, c2, and c1 genes
gi|1469213|emb|X94331|BLCR024C [1469213]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 4 protein links)

X78410 

Bacteriophage phiadh holin and lysin genes
gi|793848|emb|X78410|LGHOLLYS [793848]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 1 nucleotide neighbor)

X99260 

Bacteriophage B103 genomic sequence
gi|1429229|emb|X99260|BB103S [1429229]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 17 protein links, or 12 nucleotide neighbors)

AJ000741 

Bacteriophage P1 darA operon
gi|2462938|emb|AJ000741|BPAJ7641 [2462938]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 10 protein links, or 31 nucleotide neighbors)

X87420 

Bacteriophage ES18 genes 24, c2, cro, c1, 18, and oL and oR operators
gi|1143407|emb|X87420|BPES18GEN [1143407]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 5 protein
links, or 9 nucleotide neighbors)

L35561 

Bacteriophage phi-105 ORFs 1-3
gi|532218|gb|L35561|PH5ORFHTR [532218]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, or 3 protein links)

D10027 

Group II RNA coliphage GA genome
gi|217784|dbj|D10027|PGAXX [217784]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, 5 nucleotide neighbors, or 1 genome link)

V01128

Bacteriophage phi-X174 (cs70 mutation) complete genome
gi|15535|emb|V01128|PHIX174 [15535]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 4 MEDLINE
links, 11 protein links, or 26 nucleotide neighbors)
S81763

coat gene...replicase gene (bacteriophage KU1, host=Escherichia coli,
group II phage, Genomic RNA, 3 genes, 120 nt)
gi|1438766|gb|S81763|S81763 [1438766]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 1
MEDLINE link)

U38906 

Bacteriophage r1t integrase, repressor protein (rro), dUTPase, holin and
lysin genes, complete cds
gi|1353517|gb|U38906|BRU38906 [1353517]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 50 protein links, or 3 nucleotide neighbors)



X91149 

Bacteriophage phi-C31 DNA cos region
gi|1107473|emb|X91149|APHIC31C [1107473]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 6 protein links, or 1 nucleotide neighbor)

V00642



phage MS2 genome

gi|15081|emb|V00642|LEMS2X [15081]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, MEDLINE
links, 4 protein links, or 20 nucleotide neighbors)

V01146



Genome of bacteriophage T7 
gi|431187|emb|V01146|T7CG [431187]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 13 MEDLINE
links, 60 protein links, 105 nucleotide neighbors, or 1 genome link)

X78401



Bacteriophage P22 right operon, orf 48, replication genes 18 and 12, nin
region genes, ninG phosphatase, late control gene 23, orf 60, complete
cds, late control region, start of lysis gene 13
gi|512343|emb|X78401|POP22NIN [512343]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 2 MEDLINE
links, 13 protein links, or 4 nucleotide neighbors)

Y00408 



Bacteriophage T4 gene t for lysis protein 
gi|15368|emb|Y00408|MYT4T [15368]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 3 nucleotide neighbors)

Bacteriophage mv4 lysA and lysB genes
gi|410500|emb|Z26590|MV4LYSAB [410500]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, or 4
protein links)

X07809 



Phage phi X174 lysis (E) gene upstream region
gi|15094|emb|X07809|MIPHIXE [15094]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 4 nucleotide neighbors)

Z34528

Lactococcal bacteriophage c2 lysin gene
gi|506455|emb|Z34528|LBC2LYSIN [506455]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors)

X15031

Bacteriophage fr RNA genome
gi|15071|emb|X15031|LEBFRX [15071]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, 9 nucleotide neighbors, or 1 genome link)

X80191 

Bacteriophage PP7 mRNA for maturation, coat, lysis and replicase
proteins
gi|517237|emb|X80191|BPP7PR [517237]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 1 genome link)

X85010 

Bacteriophage A511 ply511 gene
gi|853748|emb|X85010|BPA511PLY [853748]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)

X85009 

Bacteriophage A500 hol500 and ply500 genes
gi|853744|emb|X85009|BPA500PLY [853744]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 4 nucleotide neighbors)




X85008 

Bacteriophage A118 hol118 and ply118 genes
gi|853740|emb|X85008|BPA118PLY [853740]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)

Z35638

Bacteriophage phi-X174 genes for lysis protein and beta-lactamase
gi|520996|emb|Z35638|BPLYSPR [520996]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 5)6 nucleotide neighbors ) 

J02459 

Bacteriophage lambda, complete genome 
gi|215104|gb|J02459|LAMCG [215104]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 87 MEDLINE
links, 67 protein links, 190 nucleotide neighbors, or 1 genome link)

X87674 



Bacteriophage P1 lydA & lydB genes
gi|974763|emb|X87674|BACP1LYD [974763]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 2 nucleotide neighbors)



X87673 



Bacteriophage P1 gene 17
gi|974761|emb|X87673|BACP117 [974761]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 1 nucleotide neighbor)

M14784 

Bacteriophage T3 strain amNG220B right end, tail fiber protein, lysis 
protein and DNA packaging proteins, complete cds 
gi|215810|gb|M14784|PT3RE [215810]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 9 protein links, or 10 nucleotide neighbors)

M11813

Bacteriophage PZA (from B.subtilis), complete genome
gi|216046|gb|M11813|PZACG [216046]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 27 protein links, 17 nucleotide neighbors, or 1 genome link)

M16812 

Bacteriophage K3 Y lysis gene, complete cds 
gi|215503|gb|M16812|PK3LYST [215503]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 1 protein link, or 4 nucleotide neighbors)

J04356

Bacteriophage P22 proteins 15 (complete cds), and 19 (3' end) genes
gi|215265|gb|J04356|P2215P [215265]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 2 nucleotide neighbors)



J04343

Bacteriophage JP34 coat and lysis protein genes, complete cds, and
replicase protein gene, 5' end
gi|215076|gb|J04343|JraCOLY [215076]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 2 nucleotide neighbors)

J02482 

Bacteriophage phiX174, complete genome
gi|216019|gb|J02482|PX1CG [216019]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 23 MEDLINE
links, 11 protein links, 26 nucleotide neighbors, or 1 genome link)

M99441 

Bacteriophage T4 anti-sigma 70 protein (asiA) gene, complete cds and
lysis protein, 3' end
gi|215820|gb|M99441|PT4ASIA [215820]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 3 MEDLINE
links, 2 protein links, or 2 nucleotide neighbors)

M65239 

Bacteriophage 21 lysis genes S, R, and Rz, complete cds
gi|215466|gb|M65239|PH2LYSGEN [215466]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 3 protein links, or 1 nucleotide neighbor)

M10637 



Phage G4 D/E overlapping gene system, encoding D (morphogenetic) and E
(lysis) proteins
gi|215427|gb|M10637|PG4DE [215427]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 2 protein links, or 12 nucleotide neighbors)



J02454 



Bacteriophage G4, complete genome
gi|215415|gb|J02454|PG4CG [215415]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 6 MEDLINE
links, 11 protein links, 20 nucleotide neighbors, or 1 genome link)

J02580 

Bacteriophage PA-2 (E.coli porcine strain isolate) Rz gene, 5' end; ORF2,
outer membrane porin protein (lc) and ORF1 genes, complete cds
gi|215366|gb|J02580|PA2LC [215366]

(View GenBank report, FASTA report, ASN.1 report, Graphical view, 1 MEDLINE
link, 4 protein links, or 4 nucleotide neighbors)